PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
OF PREPARING PLANT ARTIFICIAL CHROMOSOMES 

RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. Provisional Application No. 
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF
AND METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES and 
to U.S. Provisional Application No. 60/296,329, filed June 4, 2001, by CARL 
PEREZ AND STEVEN FABIJANSKI entitled PLANT ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING PLANT
ARTIFICIAL CHROMOSOMES. This application is related to U.S. Provisional 
Application No. 60/294,758, filed May 30, 2001, by EDWARD PERKINS et 
al. entitled CHROMOSOME-BASED PLATFORMS and to U.S. Provisional
Application No. 60/366,891, filed March 21, 2002, by EDWARD
PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS. This
application is also related to U.S. Provisional Application Attorney Docket
No. 24601-420, filed May 30, 2002, by EDWARD PERKINS et al. entitled
CHROMOSOME-BASED PLATFORMS and to PCT International Patent
Application Attorney Docket No. 24601-420PC, filed May 30, 2002, by
EDWARD PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS.
This application is related to U.S. application Serial No. 08/695,191, filed 
August 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR 
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,025,155.
This application is also related to U.S. application Serial No. 08/682,080,
filed July 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR 
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,077,697. 
This application is also related to U.S. application Serial No. 08/629,822, filed
April 10, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned), and is also 
related to copending U.S. application Serial No. 09/096,648, filed June 12, 
1998, by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL 
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No. 09/835,682,
filed April 10, 1997 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR 
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned). This
application is also related to copending U.S. application Serial No.
09/724,726, filed November 28, 2000, U.S. application Serial No. 
09/724,872, filed November 28, 2000, U.S. application Serial No. 
09/724,693, filed November 28, 2000, U.S. application Serial No. 
09/799,462, filed March 5, 2001, U.S. application Serial No. 09/836,911,
filed April 17, 2001, and U.S. application Serial No. 10/125,767, filed April
17, 2002, each of which is by GYULA HADLACZKY and ALADAR SZALAY, 
and is entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND 
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application 
is also related to International PCT application No. WO 97/40183. Where
permitted, the subject matter of each of these applications is incorporated by
reference in its entirety.

FIELD OF THE INVENTION

Artificial chromosomes and methods of producing artificial 
chromosomes, particularly for use in delivery of nucleic acids and expression
thereof in plants, are provided. Also provided are methods of use of artificial
chromosomes in the delivery of nucleic acids to host cells, including plant 
cells, and the expression of the nucleic acids therein. The resulting plant 
cells, tissues, organs and whole plants containing the artificial chromosomes, 
plant cell-based methods for production of heterologous proteins and 
methods of producing transgenic organisms, particularly plants, using the
artificial chromosomes are provided.

BACKGROUND OF THE INVENTION

The stable transfer of nucleic acids into plant cells and the expression 
of the nucleic acids therein pose many challenges. Many efforts at the
stable introduction of nucleic acids into plant cells have utilized
Agrobacterium-mediated transformation. Agrobacterium is a free-living
Gram-negative soil bacterium. Virulent strains of this bacterium are able to
infect plant tissue and induce the production of a neoplastic growth
commonly referred to as a crown gall. Virulent strains of Agrobacterium
contain a large plasmid DNA known as a Ti-plasmid that contains genes 
required for DNA transfer (vir genes) and replication as well as a region of 
DNA that is transferred to plant cells called T-DNA. The T-DNA region is 
bordered by T-DNA border sequences that are crucial to the DNA transfer
process. These T-DNA border sequences are recognized by the products of
the vir genes encoded on the Ti-plasmid, which are responsible for the DNA
transfer process.

Most wild-type Agrobacterium have a relatively broad dicot plant host 
range and are capable of transferring T-DNA regions up to 25 kilobases of
DNA (e.g., nopaline strains) or more (e.g., octopine strains). Accordingly,
numerous methods of using Agrobacterium to transfer DNA into plant cells 
have been developed based on the engineering of the Ti-plasmid to no longer 
contain the genes responsible for altered morphology and replacing these
genes with a recombinant gene encoding a trait of interest. There are two
primary types of Agrobacterium-based plant transformation systems, binary
[see, e.g., U.S. Patent No. 4,940,838] and co-integrate [see, e.g., Fraley et 
al. (1985) Biotechnology 3:629-635] methods. The T-DNA border repeats 
are maintained in both systems and the natural DNA transfer process is used 
to transfer the portion of DNA located between the T-DNA borders into the
plant cell.



WO 02/096923 



PCT/US02/17451 



Another plant cell transformation system, termed biolistics, involves 
the bombardment of plant cells with microscopic particles coated with DNA 
encoding a new trait. The particles are rapidly accelerated, typically by gas 
or electrical discharge, through the cell wall and membranes, whereby the 
DNA is released into the cell and is incorporated into the genome of the cell.
This method is used for transformation of many crops, including corn, wheat, 
barley, rice, woody tree species and others. 

A significant number of crop species of commercial interest have been 
transformed using either Agrobacterium-mediated or biolistic systems.
However, these methods have many limitations that restrict their utility. For
example, there are limits to the size of the heterologous DNA that can be 
transferred using these methods; typically, only one to two genes may be 
transferred. Thus, although these methods may have utility in producing 
crop products modified to contain a single new trait, such as insect or
herbicide tolerance, they may not be sufficient to transfer DNA that will
provide for multiple traits, or very large DNA segments encoding a 
multiplicity of traits. 

In addition, the genetically modified plant cells produced by these 
methods tend to contain the transferred DNA in euchromatic regions of the
genomic DNA. Typically, a large number of independent transgenic insertion
events must be screened before a suitable event (such as insertion of a gene 
into the host genomic DNA such that it provides a sufficient level of gene 
expression within temporal and spatial expectations and without evidence of 
gene rearrangement) is identified. 

Another limitation of these methods is the effort required to utilize
them in the genetic modification of many commercially important crops. For 
example, transformation efficiency can vary with the crop and can be low, 
notably in cereal crops such as corn and wheat. Often the inserted genes 
are rearranged and unstable over generations. 
Furthermore, Agrobacterium tumefaciens relies on host-parasite
interaction in order to be successful. This has the effect that Agrobacterium 
has a preference for some dicots, while other dicots, monocots and conifers 
are resistant to transformation via Agrobacterium. Self-replicating vectors 
have also been used in the transfer of nucleic acids into plant cells. Such
episomal vectors contain DNA sequences that are required for DNA
replication and sustainability of the vector in a living cell. In higher plants, 
very few episomal vectors have been developed. These episomal vectors
have a very limited capacity for carrying genetic information and are
unstable. One example of an episomal plant vector is
the Cauliflower Mosaic Virus [Brisson et al. (1984) Nature 310:511].

Limitations of these gene delivery technologies necessitate the 
development of alternative vector systems suitable for transferring large (up 
to Mb size or larger) genes, gene complexes, and multiple genes together
with regulatory elements for safe, controlled, and persistent expression of
the desired genetic material in higher organisms, particularly plants, without 
rearrangement caused by insertion or mutagenesis. Therefore, it is an object 
herein to provide artificial chromosomes for the introduction of large nucleic 
acids into eukaryotic cells and methods using the artificial chromosomes,
particularly for the introduction and expression of nucleic acids in plants.

SUMMARY OF THE INVENTION

Provided herein are plant artificial chromosomes and methods for 
producing plant artificial chromosomes. The artificial chromosomes are fully 
functional stable chromosomes. Plant artificial chromosomes provided herein
have a particular composition that makes them ideal vectors for stable,
controlled, high-level expression of heterologous nucleic acids in plant cells.
The artificial chromosomes are capable of independent, extra-genomic 
maintenance, replication and segregation within cells and can carry multiple, 
large heterologous genes. 
Artificial plant chromosomes provided herein are non-natural
chromosomes that exhibit an ordered segmentation that distinguishes them 
from naturally occurring chromosomes. The segmented appearance can be 
visualized using a variety of chromosome analysis techniques and correlates 
with the unique structure of these artificial chromosomes, which, in
particular methods of producing these chromosomes, can arise through 
amplification of chromosomal segments (i.e., amplification-based artificial 
chromosomes). The artificial chromosomes, throughout the region or regions 
of segmentation, are predominantly made up of one or more nucleic acid
units that is (are) repeated in the region (referred to as the repeat region) and
that have a similar gross structure. Repeats of a nucleic acid unit tend to be 
of similar size and share some common nucleic acid sequences, for example, 
a replication site involved in amplification of chromosome segments and/or 
some heterologous nucleic acid. Although the size of a repeating nucleic
acid unit can vary, it is typically greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. Typically, repeats of a nucleic acid unit are 
substantially similar in nucleic acid composition and can be nearly identical. 
The common nucleic acid sequences can contain sequences that represent
euchromatic and heterochromatic nucleic acid. The composition of the
amplification-based artificial chromosomes can be such that substantially the
entire chromosome exhibits a segmented appearance or such that only one
or more portions that make up less than the entire chromosome appear
segmented. 

The composition of the plant artificial chromosomes provided herein
can vary. For example, in some of the artificial chromosomes provided 
herein, the repeat region or regions can be made up predominantly of 
heterochromatic DNA (i.e., the repeat region or regions contain more 
heterochromatic DNA than other types of DNA, e.g., euchromatic DNA). In
other artificial chromosomes provided herein, the repeat region or regions can
be made up predominantly of euchromatic DNA (i.e., the repeat region or
regions contain more euchromatic DNA than other types of DNA, e.g., 
heterochromatic DNA) or can be made up of substantially equivalent 
amounts of heterochromatic and euchromatic DNA, e.g., about 40% to 
about 50% of one type of nucleic acid and about 50% to about 60% of the
other type of nucleic acid. The repeat region or regions thus can be entirely 
heterochromatic (while still containing one or more heterologous genes), or 
can contain increasing amounts of euchromatic DNA, such that, for example, 
the region contains about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%,
90% or greater than 90% euchromatic DNA. Common nucleic acid
sequences within repeated nucleic acid units in a repeat region can contain
DNA that represents euchromatic nucleic acid and DNA that represents 
heterochromatic nucleic acid. Because the entire artificial chromosome can 
be made up predominantly of a repeat region or regions (e.g., the
composition of the chromosome is such that the repeat region or regions
make up greater than about 50% or greater than about 60% of the 
chromosome), it is thus possible for the artificial chromosome to be made up 
predominantly of heterochromatin or euchromatin, or to be made up of 
substantially equivalent amounts of heterochromatin and euchromatin, e.g.,
about 40% to about 50% of one type of nucleic acid and about 50% to
about 60% of the other type of nucleic acid. Plant artificial chromosomes 
provided herein can be isolated or contained within cells or vesicles. 
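
The composition categories described above can be illustrated with a short sketch. It is offered only as an illustration and forms no part of the methods provided herein. The function name, the default cut-off of 40% and the simplification that a region contains only heterochromatic and euchromatic DNA are assumptions made for the example. Because "about 40% to about 50%" is an approximate range, the cut-off is a parameter and not a fixed value.

```python
def classify_composition(heterochromatin_pct: float,
                         equivalence_floor_pct: float = 40.0) -> str:
    """Label a repeat region (or chromosome) by its heterochromatin percentage.

    Assumption for illustration: the region holds only heterochromatic and
    euchromatic DNA, so euchromatin is 100 minus the heterochromatin share.
    """
    if not 0.0 <= heterochromatin_pct <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    euchromatin_pct = 100.0 - heterochromatin_pct
    # "Substantially equivalent": the minor component is about 40% to 50%.
    # This check runs first, so a 55/45 split is reported as equivalent.
    if min(heterochromatin_pct, euchromatin_pct) >= equivalence_floor_pct:
        return "substantially equivalent"
    # Otherwise the larger component predominates.
    if heterochromatin_pct > euchromatin_pct:
        return "predominantly heterochromatic"
    return "predominantly euchromatic"


if __name__ == "__main__":
    for pct in (90.0, 45.0, 10.0):
        print(pct, classify_composition(pct))
```

Testing for substantial equivalence first is a choice made for the sketch. The text defines "predominantly" as containing more of one type of DNA than the other, so a 55%/45% region could also be described as predominantly of the major type.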

Also provided herein are cells containing plant artificial chromosomes 
as described herein, including plant cells and animal cells. Included among
the cells containing the plant artificial chromosomes are any cells that include
one or more plant chromosomes. Included, for example, are plant cells, 
including plant protoplasts, in culture and within plant tissues, organs, seeds, 
pollen or whole plants. Plant cells containing the plant artificial 
chromosomes can be from any type of plant, including monocots and dicots.
For example, the plant cells can be from Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum, Helianthus,
Oryza, Glycine (soybean) or Gossypium (cotton). Also contemplated are
mammalian and other animal cells that contain plant artificial chromosomes.

Plant cells containing artificial chromosomes of any species are also 
provided herein. Thus, for example, such plant cells can contain an artificial
chromosome containing an animal, e.g., mammalian, centromere or an insect 
or avian centromere. Included among the artificial chromosomes contained 
within plant cells as provided herein are predominantly heterochromatic
artificial chromosomes [formerly referred to as satellite artificial
chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697 and
6,025,155 and published International PCT
application No. WO 97/40183], minichromosomes which contain a de novo 
centromere, artificial chromosomes containing one or more regions of 
repeating nucleic acid units wherein the repeat region(s) contain substantially 
equivalent amounts of euchromatic and heterochromatic nucleic acid and in
vitro assembled artificial chromosomes, each from any species. An
exemplary artificial chromosome is a mammalian satellite artificial 
chromosome containing a mouse centromere. Included among the plant cells 
containing artificial chromosomes of any species are plant cells, including 
plant protoplasts, in culture and within plant tissues, organs, seeds, pollen or
whole plants. Plant cells containing the artificial chromosomes can be from
any type of plant, including monocots and dicots. For example, the plant 
cells can be from Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, 
Hordeum, Zea mays, Brassica, Triticum, Helianthus and Oryza. 

Further provided herein are methods of producing plant artificial
chromosomes. One embodiment of these methods includes the steps of
introducing nucleic acid into a cell containing plant chromosomes and 
selecting a cell containing an artificial chromosome that contains one or more 
repeat regions in which one or more nucleic acid units is (are) repeated. The 
repeats of a nucleic acid unit in a repeat region can contain common nucleic
acid sequences and can be substantially identical. In some embodiments of
this method, the repeat region(s) of the artificial chromosome contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic 
acid. The artificial chromosome can be predominantly made up of one or 
more repeat regions. In further embodiments of this method, the artificial 
chromosome is made up of substantially equivalent amounts of euchromatic
and heterochromatic nucleic acid. In further embodiments of this method, 
the repeats of a nucleic acid unit have common nucleic acid sequences 
which contain sequences that represent euchromatic and heterochromatic 
nucleic acid. 

Any cell containing plant chromosomes can be used in these
embodiments of methods of producing plant artificial chromosomes described 
herein. For example, the cell can be any cell that contains chromosomes 
from Arabidopsis, tobacco, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Oryza, Capsicum, lentil and/or Helianthus, including
cells or protoplasts of Arabidopsis, tobacco and/or Helianthus.

The nucleic acid that is introduced into a cell containing plant 
chromosomes in methods of producing a plant artificial chromosome as 
provided herein can be any nucleic acid, including, but not limited to, satellite 
DNA, rDNA and lambda phage DNA. Satellite DNA and rDNA include such
DNA from plants, such as, for example, Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza, 
and from animals, such as mammals. The rDNA can contain sequences of 
an intergenic spacer region, such as can be obtained, for example, from DNA 
of Arabidopsis, Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat,
radish and mung bean. In some embodiments of the method, the nucleic
acid contains a nucleic acid sequence that facilitates amplification of a region
of a plant chromosome or targets it to an amplifiable region of a plant
chromosome. 

In further embodiments of methods of producing plant artificial 
chromosomes provided herein, the nucleic acid that is introduced into a cell
containing one or more plant chromosomes includes nucleic acid that provides for
identification of cells containing the nucleic acid. Such nucleic acids include 
nucleic acid encoding a fluorescent protein, such as a green, blue or red 
fluorescent protein, and nucleic acid encoding a selectable marker, such as, 
for example, proteins that confer resistance to phosphinothricin, ammonium
glufosinate, glyphosate, kanamycin, hygromycin, dihydrofolate or
sulfonylurea. 

In embodiments of methods of producing plant artificial chromosomes 
in which nucleic acid is introduced into a cell containing one or more plant
chromosomes, the cell can be cultured through two or more cell doublings,
and typically from about 5 to about 60, or about 5 to about 55, or about 10 
to about 55, or about 25 to about 55, or about 35 to about 55 cell doublings 
following introduction of nucleic acid into a cell. The step of selecting a cell 
containing a plant artificial chromosome can include sorting of cells into
which nucleic acid was introduced. For example, cells can be sorted on the
basis of the presence of a selectable marker, such as a reporter protein, or 
by growing (culturing) the cells under selective conditions. The selection 
step can include fluorescent in situ hybridization (FISH) analysis of cells into 
which nucleic acid is introduced. 

Also provided are methods of producing a transgenic plant using
artificial chromosomes that function in plants and transgenic plants 
containing artificial chromosomes. Artificial chromosomes used in the 
methods of producing transgenic plants can be of any species. For example, 
the artificial chromosomes can contain a centromere from species such as
animals (e.g., mammals, birds or insects) or plants that functions to segregate
nucleic acids to daughter cells through cell division. In some embodiments 
of the methods for producing a transgenic plant, the artificial chromosomes 
contain repeat regions predominantly made up of repeats of one or more 
nucleic acid units. Repeats of a nucleic acid unit can share some common
nucleic acid sequences, for example, a replication site involved in
amplification of chromosome segments and/or some heterologous nucleic
acid. Repeats of a nucleic acid unit can be substantially identical. Common 
nucleic acid sequences of repeats of a nucleic acid unit can contain 
sequences that represent euchromatic and heterochromatic nucleic acid. 
Repeat regions of artificial chromosomes that can be used in the
methods of producing a transgenic plant can be made up of substantially 
equivalent amounts of heterochromatic and euchromatic DNA or can be 
made up predominantly of heterochromatic DNA or can be made up 
predominantly of euchromatic DNA. The artificial chromosome can be made
up predominantly of heterochromatic or euchromatic DNA or can be made up
of substantially equivalent amounts of heterochromatin and euchromatin. 
Such artificial chromosomes that contain plant centromeres can contain a 
plant centromere from any species of plant, including monocots and dicots. 
For example, the centromere can be from Arabidopsis, tobacco, Helianthus,
Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, Triticum, rye,
wheat, radish, mung bean or Oryza. The artificial chromosomes can be
made using methods described herein. 

In a method of producing a transgenic plant provided herein, an 
artificial chromosome, such as those described above and elsewhere herein,
is introduced into a plant cell. The artificial chromosome can contain
heterologous nucleic acid encoding a gene product such as, for example, an 
enzyme, antisense RNA, tRNA, rDNA, a structural protein, a marker or 
reporter protein, a ligand, a receptor, a ribozyme, a therapeutic protein, a 
biopharmaceutical protein, a vaccine, a blood factor, an antigen, a hormone,
a cytokine, a growth factor or an antibody. The product can be one that
provides for resistance to diseases, insects, herbicides or stress in the plant. 
The product can be one that provides for an agronomically important trait in 
the plant and/or that alters the nutrient utilization and/or improves the 
nutrient quality of the plant. Heterologous nucleic acid of an artificial 
chromosome can be contained within a bacterial artificial chromosome (BAC)
or a yeast artificial chromosome (YAC). 

The plant cell into which such artificial chromosomes can be
introduced in methods of producing a transgenic plant provided herein can be
any species of plant cell, including, but not limited to, Arabidopsis, tobacco,
Helianthus, Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica,
Triticum, rye, wheat, radish, mung bean, Capsicum, lentil and Oryza. Any 
cell that can develop into a plant can be used, including plant cells and 
protoplasts of plant embryos, calli, tissues, meristem, organs, seeds,
seedlings, pollen, pollen tubes or whole plants.

Artificial chromosomes can be introduced into plant cells in the 
methods of producing a transgenic plant using any process for transfer of 
nucleic acids into plant cells, including, but not limited to, chemical, physical
and electrical processes and combinations thereof. For example, the artificial
chromosomes can be transferred into plant cells via direct contact in the
absence or presence of a fusogen, e.g., polyethylene glycol (PEG), calcium
phosphate and/or lipid or they can be encapsulated in a lipid structure (e.g., a
liposome) or contained within a protoplast or microcell which is then allowed
to fuse (in the presence or absence of a fusogen such as PEG) with a plant
cell for introduction of the artificial chromosome into the cell in a method of
producing a transgenic plant. Artificial chromosomes can be transferred to 
plant cells that are subjected to electrical pulses (e.g., electroporation) and/or 
ultrasound (e.g., sonoporation) before, during and/or after exposure of the 
cells to the artificial chromosomes. Use of electrical pulses and/or ultrasound
can be in combination with any other agents, e.g., PEG and/or lipids, used in
transferring nucleic acids into plant cells. Artificial chromosomes can also be 
physically injected into plant cells through a micropipette or needle or 
introduced into plant cells through bombardment of the cells with 
microprojectiles coated with the chromosomes. To facilitate transfer of 
nucleic acids into plant cells, the recipient cells or tissue can be subjected to
mechanical wounding. 

Plant cells into which artificial chromosomes have been introduced for 
purposes of producing a transgenic plant are cultured under conditions that 
permit generation of a whole plant therefrom. The transformed cells can be
analyzed prior to use in the generation of whole plants to determine 
suitability. For example, the cells can be analyzed for the presence of 
artificial chromosomes and/or regenerative capacity. Plant regeneration 
techniques, many of which are known to those of skill in the art, can be 
used to generate whole plants from, for example, cells, embryos and calli
containing artificial chromosomes. For example, plants can be regenerated 
from cells containing artificial chromosomes by the planting of transformed 
roots, plantlets, seed, seedlings, and any structure capable of growing into a 
whole plant. 

Further provided herein are methods for producing an acrocentric plant
chromosome and methods for producing plant chromosomes containing 
adjacent regions of rDNA and heterochromatin, in particular, pericentric 
and/or satellite heterochromatin. Also provided herein are methods for 
generating acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome. 

One embodiment of these methods includes steps of introducing 
nucleic acid containing two site-specific recombination sites into a cell 
containing one or more plant chromosomes, recombining nucleic acids of the
two site-specific recombination sites, and selecting a cell containing an
acrocentric plant chromosome and/or a plant chromosome containing 
adjacent regions of rDNA and heterochromatin. The two site-specific 
recombination sites can be contained on separate nucleic acid fragments 
which are introduced into the cell simultaneously or sequentially.

Other embodiments of the methods of producing an acrocentric plant
chromosome and/or a plant chromosome that contains adjacent regions of 
rDNA and heterochromatin include steps of introducing a first nucleic acid 
containing a site-specific recombination site into a first plant chromosome, 
introducing a second nucleic acid containing a site-specific recombination
site into a second plant chromosome, recombining nucleic acids of the first 
and second chromosomes and selecting a plant chromosome that is 
acrocentric or that contains adjacent regions of rDNA and heterochromatin. 
For example, to produce an acrocentric plant chromosome, the first nucleic
acid can be introduced into or adjacent to the pericentric heterochromatin of
the first chromosome and/or the second nucleic acid can be introduced into 
the distal end of the arm of the second chromosome. To produce an 
acrocentric plant chromosome containing adjacent regions of rDNA and 
heterochromatin, for example, the first nucleic acid can be introduced into or
adjacent to the pericentric heterochromatin on the short arm of an acrocentric
plant chromosome and the second nucleic acid can be introduced into or 
adjacent to rDNA. To produce a plant chromosome containing adjacent 
regions of rDNA and heterochromatin, for example, the first nucleic acid can 
be introduced into or adjacent to heterochromatin, such as pericentric
heterochromatin or satellite DNA, and the second nucleic acid can be
introduced into or adjacent to rDNA. When the chromosomes are located 
within a cell, the method can include selecting a cell containing a plant 
chromosome that is acrocentric and/or that contains adjacent regions of 
rDNA and heterochromatin. 

Another embodiment of the methods of producing an acrocentric plant
chromosome includes steps of introducing a first nucleic acid containing a 
site-specific recombination site into the pericentric heterochromatin of a plant 
chromosome, introducing a second nucleic acid containing a site-specific 
recombination site into the distal end of the chromosome such that the first
and second recombination sites are located on the same arm of the
chromosome, recombining nucleic acids of the first and second
recombination sites in the chromosome and selecting a plant chromosome 
that is acrocentric. 
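
The effect of recombining two sites on the same chromosome arm can be pictured with a simplified model. The sketch below is illustrative only. It assumes that the two sites are in the same orientation, so that recombination removes the intervening DNA and leaves a single site, as is commonly the case for directly oriented site-specific recombination sites. The segment names, the segment lengths and the function names are hypothetical and are not taken from this description.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    name: str          # hypothetical label for a stretch of the chromosome
    length_mb: float   # illustrative length in megabases


def excise_between_sites(chromosome: List[Segment],
                         site_name: str = "SITE") -> List[Segment]:
    """Model recombination between two same-orientation sites.

    Keeps the first site and drops everything up to and including the second.
    """
    positions = [i for i, seg in enumerate(chromosome) if seg.name == site_name]
    if len(positions) != 2:
        raise ValueError("expected exactly two recombination sites")
    first, second = positions
    return chromosome[:first + 1] + chromosome[second + 1:]


def arm_ratio(chromosome: List[Segment], centromere_name: str = "CEN") -> float:
    """Return short-arm length divided by long-arm length."""
    cen = next(i for i, seg in enumerate(chromosome)
               if seg.name == centromere_name)
    arm_a = sum(seg.length_mb for seg in chromosome[:cen])
    arm_b = sum(seg.length_mb for seg in chromosome[cen + 1:])
    return min(arm_a, arm_b) / max(arm_a, arm_b)


# Hypothetical chromosome: one site in the pericentric heterochromatin and
# one near the distal end of the same arm (all lengths are made up).
example = [
    Segment("long_arm", 30.0),
    Segment("CEN", 0.0),
    Segment("pericentric_heterochromatin", 2.0),
    Segment("SITE", 0.0),
    Segment("arm_euchromatin", 25.0),
    Segment("SITE", 0.0),
    Segment("distal_end", 1.0),
]
recombined = excise_between_sites(example)

if __name__ == "__main__":
    print("arm ratio before:", round(arm_ratio(example), 3))
    print("arm ratio after: ", round(arm_ratio(recombined), 3))
```

In this model the arm carrying both sites shrinks to the pericentric heterochromatin plus the distal end, so the centromere lies much closer to one end of the chromosome, which is the acrocentric arrangement that the selecting step looks for.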

Another method of producing an acrocentric plant chromosome or a 
plant chromosome containing adjacent regions of rDNA and heterochromatin
includes steps of introducing nucleic acid containing a recombination site 
adjacent to or sufficiently near nucleic acid encoding a selectable marker into 
a first plant cell for recombination and introduction of the marker into the 
chromosome, generating a first transgenic plant from the first plant cell,
introducing nucleic acid containing a promoter functional in a plant cell and a
recombination site in operative linkage into a second plant cell, generating a 
second transgenic plant from the second plant cell, crossing the first and 
second plants, obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker, and selecting a
resistant plant that contains cells containing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA 
and heterochromatin. Methods of this embodiment can optionally include 
steps of selecting first and second transgenic plants such that one of the 
plants contains a chromosome containing a recombination site in a region
within or adjacent to the pericentric heterochromatin and the other plant
contains a chromosome containing a recombination site located within or 
adjacent to rDNA of the chromosome. These methods can further include 
the steps of selecting first and second transgenic plants where one of the 
plants contains a chromosome containing a recombination site located on a
short arm of the chromosome in a region adjacent to the pericentric
heterochromatin, and
the other plant contains a chromosome containing a recombination site
located in rDNA of the chromosome. In one embodiment, the recombination 
sites on the two chromosomes are in the same orientation. 
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ln methods of producing an acrocentric plant chromosome, one or 
both of these recombination sites is located on a short arm of the 
chromosome. For example, one of the one of the plants contains a 
chromosome containing a recombination site in region within or adjacent to 
5 the pericentric heterochromatin located on the short arm of the chromosome. 
The selecting steps can further include selecting first and second transgenic 
plants such that the recombination sites on the two chromosomes are in the 
same orientation. 

In any of these methods of producing an acrocentric plant
chromosome or a plant chromosome containing adjacent regions of rDNA
and heterochromatin (in particular, pericentric heterochromatin and/or
satellite DNA), recombination between the first and second site-specific
recombination sites can be provided for in a number of ways. For example, a
recombinase activity that catalyzes the recombination reaction can be
introduced into a cell containing one or more chromosomes containing the
sites. The recombinase activity can be encoded by nucleic acid that is
introduced into the cell simultaneously with nucleic acid containing a
site-specific recombination site or that is introduced into the cell at a
different time. Recombinase activity occurs within the cell upon expression of
the nucleic acid encoding a recombinase activity, which can be operatively
linked to a promoter functional in the cell. The recombinase activity can be
constitutively expressed or can be induced, for example, by linking the
nucleic acid encoding the recombinase to an inducible promoter. It is also
possible that a cell into which nucleic acid containing site-specific
recombination sites is introduced contains a recombinase enzyme which can
be constitutively or inducibly expressed. Alternatively, a transgenic plant can
be generated from cells containing the recombination sites and crossed with
a transgenic plant containing nucleic acid encoding a recombinase.

Any site-specific recombinase system known to those of skill in the
art is contemplated for use herein. It is contemplated that one or a plurality
of sites that direct the recombination by the recombinase are introduced into
the ACes (or other ACs) and then heterologous genes linked to the cognate
site are introduced into an ACes to produce platform ACes. The resulting
ACes are introduced into cells with nucleic acid encoding the cognate
recombinase, typically on a vector, and nucleic acid encoding heterologous
nucleic acid of interest linked to the appropriate recombination site for
insertion into the ACes chromosome. The recombinase-encoding nucleic
acid may be introduced on the AC, including ACes, or on the same or a
different vector from the heterologous nucleic acid.

For the methods herein any recombinase enzyme that catalyzes
site-specific recombination can be used to facilitate recombination between
the first and second site-specific recombination sites. A variety of
recombinases and attachment/recombination sites therefor are available
and/or known to those of skill in the art. These include, but are not limited
to: the Cre/lox recombination system using CRE recombinase from the
Escherichia coli phage P1, the FLP/FRT system of yeast using the FLP
recombinase from the 2μ episome of Saccharomyces cerevisiae, the
resolvases, including Gin recombinase of phage Mu, Cin, Hin, αδ Tn3; the Pin
recombinase of E. coli, the R/RS system of the pSR1 plasmid of
Zygosaccharomyces rouxii, site-specific recombinases from Kluyveromyces
drosophilarum and Kluyveromyces waltii, and other systems.

Also contemplated is the E. coli phage lambda integrase system, the phage
lambda integrase and the cognate att sites (see, also copending application
U.S. application Serial No. (attorney docket No. 24601-420, filed on the
same day herewith)).
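
For quick reference, the recombinase and site pairings named above can be
held in a small lookup table. The sketch below is illustrative only: it
records just the pairings and sources stated in the text, marks details the
text does not give as None, and the names RECOMBINASE_SYSTEMS and
cognate_site are hypothetical.

```python
# Recombinase systems named in the text, keyed by recombinase.
# "site" is None where the text does not name a recognition site.
RECOMBINASE_SYSTEMS = {
    "Cre": {"site": "lox", "source": "Escherichia coli phage P1"},
    "FLP": {"site": "FRT", "source": "2μ episome of Saccharomyces cerevisiae"},
    "Gin": {"site": None, "source": "phage Mu"},
    "Pin": {"site": None, "source": "E. coli"},
    "R": {"site": "RS", "source": "pSR1 plasmid of Zygosaccharomyces rouxii"},
    "lambda integrase": {"site": "att", "source": "E. coli phage lambda"},
}


def cognate_site(recombinase):
    """Return the recognition site paired with a recombinase, or None if
    the recombinase is not in the table or no site is recorded for it."""
    entry = RECOMBINASE_SYSTEMS.get(recombinase)
    return entry["site"] if entry else None


print(cognate_site("Cre"))
print(cognate_site("lambda integrase"))
```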

In any of these methods of producing acrocentric plant chromosomes,
nucleic acid containing a site-specific recombination site can also contain
nucleic acid encoding a selectable marker. The nucleic acids used in the
methods can be designed such that expression of the selectable marker
occurs only upon the desired recombination event.

Acrocentric plant chromosomes produced by the methods provided
herein can be of any composition. For example, the DNA of the short arm of
the acrocentric chromosome can contain less than 5% or less than 1%
euchromatic DNA or can contain no euchromatic DNA. Acrocentric plant
artificial chromosomes in which the short arm of the acrocentric chromosome
does not contain euchromatic DNA are provided.

In another embodiment, a method of producing a plant artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell acrocentric chromosome in which the short arm does not
contain euchromatic DNA; culturing the cell through at least one cell division;
and selecting a cell containing an artificial chromosome, such as one that is
predominantly heterochromatic. The acrocentric chromosome is
produced by any of the methods described herein or other
suitable methods.

In another embodiment, a method for producing an artificial
chromosome is provided that includes the steps of introducing nucleic acid
into a plant cell; and selecting a plant cell that includes an artificial
chromosome that contains one or more repeat regions. In this AC, one or
more nucleic acid units is (are) repeated in a repeat region; repeats of a
nucleic acid unit have common nucleic acid sequences; and the common
sequences of nucleotides include sequences that represent euchromatic and
heterochromatic nucleic acid. The nucleic acid can include plant rDNA from
a dicot plant species or plant rDNA from a monocot plant species. The
intergenic spacer region can be from DNA from a Nicotiana plant or other
suitable source of such DNA. The rDNA can be plant rDNA, and the plant
can be a dicot or a monocot.

Also provided are isolated plant artificial chromosomes that contain
one or more repeat regions. In these ACs one or more nucleic acid units is
(are) repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the common sequences of nucleotides include
sequences that represent euchromatic and heterochromatic nucleic acid. The
artificial chromosome can be produced by a method that includes the steps
of: introducing nucleic acid into a plant cell; and selecting a plant cell
containing an artificial chromosome that contains one or more repeat regions.
The repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences contain sequences that represent
euchromatic and heterochromatic nucleic acid.

In another embodiment, another method for producing an acrocentric
plant chromosome is provided. The method includes the steps of:
introducing nucleic acid containing two site-specific recombination sites into
a cell containing one or more plant chromosomes; and introducing into the
cell a recombinase activity that catalyzes recombination between the two
recombination sites to produce a plant acrocentric chromosome. In this
embodiment, the two site-specific recombination sites can be on separate
nucleic acid fragments, which optionally can be introduced into the cell
simultaneously or sequentially. The resulting artificial chromosome can be
one that is predominantly heterochromatic.

In another embodiment, a method of producing a plant artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant chromosome, such as, but not limited to, an
acrocentric chromosome, in a cell that contains adjacent regions of rDNA and
heterochromatic DNA; culturing the cell through at least one cell division;
and selecting a cell containing an artificial chromosome. The resulting
artificial chromosome can be predominantly heterochromatic. The
acrocentric chromosome can be one where the short arm of the chromosome
contains adjacent regions of rDNA and heterochromatic DNA, such as, but
not limited to, pericentric heterochromatin.

Also provided are a variety of vectors. Among these are vectors
containing nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells; a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome. Exemplary of such vectors are pAgIIa and pAgIIb.

Another vector provided herein contains nucleic acid encoding a
selectable marker that is not operably associated with any promoter, wherein
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells; and wherein the agent is not toxic to
plant cells; a recognition site for recombination; and nucleic acid encoding a
protein operably linked to a plant promoter. Exemplary of these vectors are
pAg1 and pAg2.

Another vector that is provided contains: nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of plant cells in the presence of an
agent normally toxic to the plant cells but not toxic to animal cells; a
recognition site for recombination; and nucleic acid encoding a protein
operably linked to a plant promoter.

Another vector is a plant transformation vector that contains nucleic
acid encoding a recognition site for recombination; a sequence of nucleotides
that facilitates or causes amplification of a region of a plant chromosome;
one or more selectable markers that are expressed in plant cells to permit the
selection of cells containing the vector; and Agrobacterium nucleic acid. The
vector is for Agrobacterium-mediated transformation of plants.

Another vector that is provided contains a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome, wherein the plant is selected from the group
consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus, soybean, cotton and
Oryza.

In these vectors, the amplifiable region can contain heterochromatic
nucleic acid; the amplifiable region can contain rDNA. Exemplary sequences
of nucleotides that facilitate amplification of a region of a plant chromosome
or target the vector to an amplifiable region of a plant chromosome are any
that contain a sufficient portion of an intergenic spacer region of rDNA to
facilitate amplification or effect the targeting. Such sufficient portion can be
at least 14, 20, 30, 50, 100, 150, 300, 500, 1 kb, 2 kb, 3 kb, 5 kb, 10 kb
or more contiguous nucleotides from an intergenic spacer region and/or other
rDNA region. An exemplary selectable marker encodes a product that confers
resistance to zeomycin. The proteins encoded in the vectors include a protein
that is a selectable marker that permits growth of plant cells in the presence
of an agent normally toxic to the plant cells, such as, for example, a protein
conferring resistance to hygromycin or to phosphinothricin. Other such protein
markers include, but are not limited to, fluorescent proteins, such as, for
example, green, blue and red fluorescent proteins. An exemplary recognition
site contains an att site. Exemplary promoters for inclusion in the vectors
include, but are not limited to, nopaline synthase (NOS) or CaMV35S.
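
Whether a candidate vector sequence carries a "sufficient portion" of an
intergenic spacer can be screened computationally by measuring the longest
run of contiguous nucleotides the two sequences share. The following is a
minimal sketch under stated assumptions: it compares one strand only,
requires exact matches, uses Python's standard difflib module, and the
function names and toy sequences are hypothetical (the sequences are made up
and are not real rDNA).

```python
from difflib import SequenceMatcher


def longest_shared_run(vector_seq, spacer_seq):
    """Length of the longest stretch of contiguous nucleotides present in
    both sequences (exact match, same strand only)."""
    a, b = vector_seq.upper(), spacer_seq.upper()
    # autojunk=False disables difflib's heuristic that ignores very
    # frequent characters, which would distort results on a 4-letter alphabet.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def has_sufficient_portion(vector_seq, spacer_seq, min_len=14):
    """True if the vector shares at least `min_len` contiguous nucleotides
    with the spacer; 14 is the smallest length listed in the text."""
    return longest_shared_run(vector_seq, spacer_seq) >= min_len


# Made-up sequences for illustration only.
spacer = "ACGTTGCAAGGCTTAACCGGTTACGATCGA"
vector = "CCCC" + spacer[5:25] + "GGGG"  # embeds a 20-nucleotide stretch
print(has_sufficient_portion(vector, spacer, min_len=14))
print(has_sufficient_portion("ACGTACGT", spacer, min_len=14))
```

The threshold is passed as a parameter so that any of the listed lengths
(14, 20, 30 and so on) can be tested with the same function.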

Cells containing any of the vectors or mixtures thereof are provided.
The cells include any cells that have at least one plant chromosome, such as
a plant cell. The cells can be protoplasts.

Methods using these vectors are provided. The methods include a
step of introducing one of the vectors into a cell, such as a cell that
contains at least one plant chromosome. Such vector is, for example, a
vector that contains nucleic acid encoding a selectable marker that is not
operably associated with any promoter, where the selectable marker permits
growth of animal cells in the presence of an agent normally toxic to the
animal cells but is not toxic to plant cells; a recognition site for
recombination; and nucleic acid encoding a protein operably linked to a plant
promoter. In this method, the cell contains an animal, such as a mammalian,
platform ACes that contains a recognition site, such as, for example, an att
site, that recombines with the recognition site in the vector in the presence
of the recombinase therefor, thereby incorporating the selectable marker that
is not operably associated with any promoter and the nucleic acid encoding a
protein operably linked to a plant promoter into the platform ACes to produce
a resulting platform ACes. The platform ACes can contain a promoter that,
upon recombination, is operably linked to the selectable marker that in the
vector is not operably associated with a promoter. The method can further
include transferring the resulting platform ACes into a plant cell to produce a
plant cell that contains the platform ACes. The method optionally further
includes culturing the plant cell that contains the platform ACes under
conditions whereby the protein encoded by the nucleic acid that is operably
linked to a plant promoter is expressed.
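
The selection logic described above, in which a promoterless marker becomes
expressed only after integration next to a promoter resident on the
platform, can be pictured with a toy model. The sketch below is illustrative
only and rests on simplifying assumptions: recognition sites are treated as
compatible when their labels are equal, integration always succeeds when a
recombinase is present, and all class, field and function names are
hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DonorVector:
    site: str                 # recognition site label, e.g. "att"
    promoterless_marker: str  # marker that has no promoter in the vector
    expression_cassette: str  # protein-coding DNA with its own plant promoter


@dataclass
class PlatformACes:
    site: str                    # recognition site label on the platform
    has_resident_promoter: bool  # promoter positioned beside the site
    expressed_markers: List[str] = field(default_factory=list)
    cargo: List[str] = field(default_factory=list)


def integrate(platform, vector, recombinase_present):
    """Model one integration attempt; return True if the vector integrated.

    The promoterless marker is recorded as expressed only when the platform
    supplies a promoter, so resistance reports a correct integration.
    """
    if not recombinase_present or platform.site != vector.site:
        return False
    platform.cargo.append(vector.expression_cassette)
    if platform.has_resident_promoter:
        platform.expressed_markers.append(vector.promoterless_marker)
    return True


platform = PlatformACes(site="att", has_resident_promoter=True)
vector = DonorVector("att", "promoterless marker", "plant promoter::protein")
print(integrate(platform, vector, recombinase_present=True))
print(platform.expressed_markers)
```

Without the recombinase, or at a location lacking a promoter, the marker in
this model stays silent, which is what makes it usable for selecting
recombinants.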

The resulting platform ACes optionally is isolated prior to transfer.
The ACes can be introduced into a plant cell by any suitable method, such as
one selected from among protoplast transfection, lipid-mediated delivery,
liposomes, electroporation, sonoporation, microinjection, particle
bombardment, silicon carbide whisker-mediated transformation, polyethylene
glycol (PEG)-mediated DNA uptake, lipofection and lipid-mediated carrier
systems. The resulting platform ACes can be transferred by fusion of the
cells, which, for example, are plant protoplasts. In another embodiment, the
cell can be an animal cell, such as a mammalian, including human, cell.

In another method, a vector is introduced into plant cells. Such
vector, for example, can be a vector that includes nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome. The plant cells are
cultured and a plant cell(s) containing an artificial chromosome that contains
one or more repeat regions is selected. In this method, a sufficient portion of
the vector can integrate into a chromosome in the plant cell to result in
amplification of chromosomal DNA. The resulting selected artificial
chromosome can be one in which one or more nucleic acid units is (are)
repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the repeat region(s) contain substantially
equivalent amounts of euchromatic and heterochromatic nucleic acid. The
resulting artificial chromosome produced in the method optionally can be
isolated.

Another method is also provided. This method includes the steps of
introducing a vector into a cell, and culturing the resulting cell under
conditions whereby the protein encoded by nucleic acid operably linked to
an animal promoter is expressed. In the method the vector can contain:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, where the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells but is not
toxic to plant cells; a recognition site for recombination; and nucleic acid
encoding a protein operably linked to an animal promoter. The cell can
contain a platform plant artificial chromosome (PAC) that contains a
recombination site and an animal promoter that upon recombination is
operably linked to the selectable marker that in the vector is not operably
associated with a promoter. Introduction can be effected under conditions
whereby the vector recombines with the PAC to produce a plant platform
PAC that contains the selectable marker operably linked to the promoter. In
this method, the artificial chromosome can be an ACes. In addition, the
plant platform PAC can be an ACes.

The vectors, such as those that contain nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome, and the plant
transformation vectors that contain nucleic acid for Agrobacterium-mediated
transformation of plants, can be used to produce artificial chromosomes. In
one exemplary method, such vector is introduced into a cell containing one
or more plant chromosomes; and a cell containing an artificial chromosome
that contains one or more repeat regions is selected. The artificial
chromosome contains one or more nucleic acid units that is (are) repeated in
a repeat region; the repeats of a nucleic acid unit have common nucleic acid
sequences; and the common nucleic acid sequences contain sequences that
represent euchromatic and heterochromatic nucleic acid. In another method,
a cell containing an artificial chromosome that contains one or more repeat
regions is selected. The artificial chromosome contains one or more nucleic
acid units that is (are) repeated in a repeat region; repeats of a nucleic acid
unit have common nucleic acid sequences; and the repeat region(s) contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic
acid.
DESCRIPTION OF THE DRAWINGS

Figure 1 provides a map of plasmid pAg1.

Figure 2 provides a schematic representation of the construction of
plasmid pAg1.

Figure 3 provides a map of plasmid pAg2.

Figure 4 provides a schematic representation of the construction of
plasmid pAg2.

Figure 5 provides a schematic representation of the construction of
plasmids pAgIIa and pAgIIb.

Figures 6A-6B provide restriction maps of the DNA inserted into pAg1
to form plasmids pAgIIa and pAgIIb.

Figure 7 provides a map of plasmid pSV40193attPsensePUR.

Figure 8 depicts a method for formation of a chromosome platform
with multiple recombination integration sites, such as attP sites.

Figure 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the donor vector, permits
selection of recombinants. Upon recombination with the platform, marker 2
is expressed under the control of a promoter resident on the platform.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

Definitions 

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art
to which this invention belongs. All patents, patent applications, published
applications and other publications and published nucleotide and amino acid
sequences (e.g., sequences available in GenBank or other databases) referred
to herein are incorporated by reference in their entirety. Where reference is
made to a URL or other such identifier or address, it is understood that such
identifiers can change and particular information on the internet can come
and go, but equivalent information can be found by searching the internet.
Reference thereto evidences the availability and public dissemination of such
information.

As used herein, a chromosome is a defined composition of nucleic
acid that is capable of replication and segregation within a cell upon cell
division. Typically, a chromosome may contain a centromeric region,
telomeric regions and a region of nucleic acid between the centromeric and
telomeric regions.

As used herein, a centromere is a molecular composition that includes
a nucleic acid sequence that confers an ability to segregate to daughter cells
through cell division. A centromere may confer stable segregation of a
nucleic acid sequence, including an artificial chromosome containing the
centromere, through mitotic and/or meiotic divisions. A plant centromere is
not necessarily derived from plants, but has the ability to promote DNA
segregation in plant cells.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains diffusely
and that typically contains genes, and heterochromatin refers to chromatin
that remains unusually condensed and that has been thought to be
transcriptionally inactive or has low transcriptional activity relative to
euchromatin. Highly repetitive DNA sequences (satellite DNA) are usually
located in regions of the heterochromatin surrounding the centromere
(pericentric or pericentromeric heterochromatin). Constitutive
heterochromatin refers to heterochromatin that contains the highly repetitive
DNA which is constitutively condensed and genetically inactive.

As used herein, an acrocentric chromosome refers to a chromosome 
with arms of unequal length. 

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in the cell prior to generation or introduction of an
artificial chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. A mammalian artificial chromosome
(MAC) refers to a chromosome that has an active mammalian centromere(s).
Plant artificial chromosomes (PAC), insect artificial chromosomes and avian
artificial chromosomes refer to chromosomes that include centromeres that
function in plant, insect and avian cells, respectively. Human artificial
chromosomes (HAC) refer to chromosomes that include centromeres that
function in human cells. For exemplary artificial chromosomes, see, e.g.,
U.S. Patent Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134;
5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published
International PCT application Nos. WO 97/40183 and WO 98/08964.

WO 02/096923    PCT/US02/17451

As used herein, amplification, with reference to DNA, is a process in
which segments of DNA are duplicated to yield two or multiple copies of
substantially similar or identical or nearly identical DNA segments that are
typically joined as substantially tandem or successive repeats or inverted
repeats.

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes by
virtue of an amplification event, such as one that may be initiated by
introduction of heterologous nucleic acid into heterochromatin, for example,
pericentric heterochromatin, in a chromosome. As a result of such an event,
chromosomes and/or fragments thereof exhibiting segmented or repeating
patterns arise. Artificial chromosomes can be formed from these
chromosomes and fragments. Hence, amplification-based artificial
chromosomes refer to non-natural or isolated chromosomes that exhibit an
ordered segmentation that is not typically observed in naturally occurring
chromosomes and that can be a basis for distinguishing them from naturally
occurring chromosomes. Amplification-based artificial chromosomes can
also be distinguished from naturally occurring chromosomes by virtue of their
typically smaller size and often segmented appearance when visualized. The
segmented appearance, which can be visualized using a variety of
chromosome analysis techniques as described herein and known to those of
skill in the art, correlates with the unique structure of these artificial
chromosomes. In addition to containing one or more centromeres, the
amplification-based artificial chromosomes, throughout the region or regions
of segmentation, are predominantly made up of one or more nucleic acid
units, also referred to as "amplicons", that is (are) repeated in the region and
that have a similar gross structure. Thus, a region of segmentation may be
referred to as a repeat region. Repeats of an amplicon tend to be of similar
size and share some common nucleic acid sequences. For example, each
repeat of an amplicon may contain a replication site involved in amplification
of chromosome segments and/or some heterologous nucleic acid that was
utilized in the initial production of the artificial chromosome. Typically, the
repeating units are substantially similar in nucleic acid composition and may
be nearly identical. The common nucleic acid sequences may contain
sequences that represent euchromatic and heterochromatic nucleic acid.
Amplicon sizes vary but typically tend to be greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. The composition of the amplification-based
artificial chromosomes may be such that substantially the entire chromosome
exhibits a segmented appearance or such that only one or more portions that
make up less than the entire chromosome appear segmented. The
amplification-based artificial chromosomes can also differ depending on the
chromosomal region that has undergone amplification in the process of
artificial chromosome formation. The structures of the resulting
chromosomes can vary depending upon the initiating event and/or the
conditions under which the heterologous nucleic acid is introduced, including
modification to the endogenous chromosomes. For example, in some of the
artificial chromosomes provided herein, the region or regions of segmentation
may be made up predominantly of heterochromatic DNA. In other artificial
chromosomes provided herein, the region or regions of segmentation may be
made up predominantly of euchromatic DNA or may be made up of similar
amounts of heterochromatic and euchromatic DNA. The region or regions of
segmentation thus may be entirely heterochromatic (while still containing one
or more heterologous nucleic acid sequences), or may contain increasing
amounts of euchromatic DNA, such that, for example, the region contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA. Because the entire artificial chromosome can be
made up predominantly of a region or regions of segmentation, it is thus
possible for the artificial chromosome to be made up predominantly of
heterochromatin or euchromatin, or to be made up of substantially equivalent
amounts of heterochromatin and euchromatin, e.g., about 40% to about
50% of one type of nucleic acid and about 50% to about 60% of the other
type of nucleic acid.

As used herein the term "predominantly" with respect to a
composition generally refers to a state of the composition in which it can be
characterized as being or having more of the predominant feature than other
features which are not predominant. The predominant feature may represent
more than about 50%, more than about 60%, more than about 70%, more
than about 80%, more than about 90%, more than about 95% or essentially
100% of the composition. Thus, for example, a repeat region that is
predominantly made up of heterochromatic DNA contains more
heterochromatic DNA than other types, e.g., euchromatic, of DNA. The
repeat region may be more than about 50%, more than about 60%, more
than about 70%, more than about 80%, more than about 90% or more than
about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA. An artificial chromosome predominantly made up of
heterochromatin contains more heterochromatic DNA than other types, e.g.,
euchromatic, of DNA and may be more than about 50%, more than about
60%, more than about 70%, more than about 80%, more than about 90%
or more than about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA.
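
The definition of "predominantly" can be expressed as a simple numeric
check. The sketch below is illustrative only: it treats the thresholds listed
for a repeat region (50% through 95%) as exact cut-offs, although the text
qualifies each with "about", and the function names are hypothetical.

```python
# Thresholds listed in the text for a repeat region, as fractions.
THRESHOLDS = (0.50, 0.60, 0.70, 0.80, 0.90, 0.95)


def is_predominantly(fraction):
    """True if a feature makes up more than half of the composition."""
    return fraction > 0.50


def highest_threshold_exceeded(fraction):
    """Return the largest listed threshold that `fraction` exceeds,
    or None if the feature is not predominant."""
    exceeded = [t for t in THRESHOLDS if fraction > t]
    return max(exceeded) if exceeded else None


# Example: a repeat region measured at 62% heterochromatic DNA.
print(is_predominantly(0.62))
print(highest_threshold_exceeded(0.62))
```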

As used herein an amplicon is a repeated nucleic acid unit. In some of
the artificial chromosomes described herein, an amplicon may contain a set
of inverted repeats of a megareplicon. A megareplicon represents a higher
order replication unit. For example, with reference to some of the
predominantly heterochromatic artificial chromosomes, particularly eukaryotic
chromosomes, described herein, the megareplicon may contain a set of
tandem DNA blocks (e.g., ~7.5 Mb DNA blocks) each containing satellite
DNA flanked by non-satellite DNA or may substantially be made up of rDNA.
Contained within the megareplicon is a primary replication site, referred to as
the megareplicator, which may be involved in organizing and facilitating
replication of segments of chromosomes, including, for example,
heterochromatin, pericentric heterochromatin, rDNA and/or possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

As used herein, amplifiable, when used in
reference to a chromosome, particularly the method of generating artificial
chromosomes provided herein, refers to a region of a chromosome that is
prone to amplification. Amplification typically occurs during replication and
other cellular events involving recombination (e.g., DNA repair). Included
among such regions are regions of the chromosome that contain tandem
repeats, such as satellite DNA, rDNA, and other such sequences.

Among the artificial chromosome systems provided herein are those
that are predominantly heterochromatic [formerly referred to as satellite
artificial chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697
and 6,025,155 and published International PCT application No.
WO 97/40183], minichromosomes which contain a de novo centromere,
artificial chromosomes containing one or more regions of repeating nucleic
acid units wherein the repeat region(s) contain substantially equivalent
amounts of euchromatic and heterochromatic nucleic acid and in vitro
assembled artificial chromosomes. Of particular interest herein are artificial
chromosomes that introduce and express heterologous nucleic acids in
plants. These include artificial chromosomes that have a centromere derived
from a plant, and, also, artificial chromosomes that have centromeres that
may be derived from other organisms but that function in plants. Methods
for the construction, isolation, and delivery to target cells of each type of
artificial chromosome are provided herein.

As used herein, to target nucleic acid to a locus on a chromosome 
means that the nucleic acid integrates at or near the targeted locus. Any 
method or means for effecting such integration, including, but not limited to,
homologous recombination, is contemplated. 

As used herein, a dicentric chromosome is a chromosome that 
contains two centromeres. A multicentric chromosome contains more than 
two centromeres. 

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires new
telomeres so that two chromosomes, each having one of the centromeres,
are produced. Each of the fragments is a replicable chromosome. If one of
the chromosomes undergoes amplification of primarily euchromatic DNA to
produce a fully functional chromosome that is predominantly euchromatin
(more than about 50%, more than about 70% or more than about 90%
euchromatin), it is a minichromosome. The remaining chromosome is a
formerly dicentric chromosome. If one of the chromosomes undergoes
amplification, whereby heterochromatin (such as, for example, satellite DNA)
is amplified and a euchromatic portion (such as, for example, an arm)
remains, it is referred to as a sausage chromosome. A chromosome that is
substantially all heterochromatin, except for portions of heterologous DNA, is
called a predominantly heterochromatic artificial chromosome. Predominantly
heterochromatic artificial chromosomes can be produced from other partially
heterochromatic artificial chromosomes by culturing the cell containing such
chromosomes under conditions that destabilize the chromosome and/or under
selective conditions so that a predominantly heterochromatic artificial
chromosome is produced. For purposes herein, it is understood that the
artificial chromosomes may not necessarily be produced in multiple steps,
but may appear after the initial introduction of the heterologous DNA.
Typically, artificial chromosomes appear after about 5 to about 60, or about
5 to about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell divisions following introduction of nucleic acid into a cell.
Artificial chromosomes may, however, appear after only about 5 to about 15
or about 10 to about 15 cell divisions.

As used herein, the term "satellite DNA-based artificial chromosome
(SATAC)" is interchangeable with the term "artificial chromosome expression
system (ACes)". These artificial chromosomes (ACes) include those that are
substantially all neutral non-coding sequences (heterochromatin) except for
foreign heterologous, typically gene or protein-encoding, nucleic acid, that
may be interspersed within the heterochromatin for the expression therein
(see U.S. Patent Nos. 6,025,155 and 6,077,697 and International PCT
application No. WO 97/40183), or that is in a single locus as provided
herein. The delineating structural feature is the presence of repeating units,
which are generally predominantly heterochromatin. The precise structure of
the ACes will depend upon the structure of the chromosome in which the
initial amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell.

As used herein, an artificial chromosome that is predominantly
heterochromatic (i.e., containing more heterochromatin than euchromatin,
typically more than about 50%, more than about 60%, more than about
70%, more than about 80% or more than about 90% heterochromatin) may
be produced by introducing nucleic acid molecules into cells, particularly
plant cells, and selecting cells that contain a predominantly heterochromatic
artificial chromosome. Any nucleic acid may be introduced into cells in the
methods of producing the artificial chromosomes. For example, the nucleic
acid may contain a selectable marker and/or a sequence that targets nucleic
acid to a heterochromatic region of a chromosome, particularly a plant
chromosome, such as in the pericentric heterochromatin, in the short arm of
acrocentric chromosomes, rDNA or nucleolar organizing regions. Targeting
sequences include, but are not limited to, lambda phage DNA and rDNA
(e.g., a sequence of an intergenic spacer of rDNA), particularly plant rDNA,
for production of predominantly heterochromatic artificial chromosomes in
plant cells.

After introducing the nucleic acid into cells, a cell containing a
predominantly heterochromatic artificial chromosome is selected. Such cells
may be identified using a variety of procedures. For example, repeating units
of heterochromatic DNA of these chromosomes may be discerned by G-
and/or C-banding and/or fluorescence in situ hybridization (FISH) techniques.
Prior to such analyses, the cells to be analyzed may be enriched with
artificial chromosome-containing cells by sorting the cells on the basis of the
presence of a selectable marker, such as a reporter protein, or by growing
(culturing) the cells under selective conditions. Selection of cells containing
amplified nucleic acids may also be facilitated by use of techniques such as
PCR and Southern blotting to identify cell lines with amplified regions. It is
also possible, after introduction of nucleic acids into cells, to select cells that
have a multicentric, typically dicentric, chromosome, a formerly multicentric
(typically dicentric) chromosome and/or various heterochromatic structures
and to treat them such that desired artificial chromosomes are produced.
Conditions for generation of a desired structure include, but are not limited
to, further growth under selective conditions, introduction of additional
nucleic acid molecules and/or growth under selective conditions and
treatment with destabilizing agents, and other such methods (see
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, heterologous and foreign are used interchangeably
with respect to nucleic acid and refer to any nucleic acid, including DNA and
RNA, that does not occur naturally as part of the genome in which it is
present or which is found in a location or locations in the genome that differ
from that in which it occurs in nature. Thus, heterologous or foreign nucleic
acid is nucleic acid that is not normally found in the host genome in an
identical context. It
is nucleic acid that is not endogenous to the cell and has been exogenously
introduced into the cell. Examples of heterologous DNA include, but are not
limited to, DNA that encodes a gene product or gene product(s) of interest,
introduced for purposes of modification of the endogenous genes or for
production of an encoded protein. For example, a heterologous or foreign
gene may be isolated from a different species than that of the host genome,
or alternatively, may be isolated from the host genome but operably linked to
one or more regulatory regions which differ from those found in the
unaltered, native gene. Other examples of heterologous DNA include, but
are not limited to, DNA that encodes traceable marker proteins, and DNA
that encodes a protein that confers an input trait including, but not limited to,
herbicide, insect, or disease resistance or an output trait, including, but not
limited to, oil quality or carbohydrate composition. Antibodies that are
encoded by heterologous DNA may be secreted, sequestered, stored in an
organ or tissue, accumulate in the cytoplasm or cellular organelles or
expressed on the surface of the cell in which the heterologous DNA has been
introduced.

As used herein, a "selectable marker" is a composition that can be
used to distinguish one cell from another cell. For example, a selectable
marker may be a nucleic acid encoding a readily detected protein that has
been introduced into some cells but not others. Detection of the expressed
protein in cells facilitates identification of cells containing the marker nucleic
acid by distinguishing them from cells that do not contain the nucleic acid.
Thus, for example, a selectable marker may be a fluorescent protein, such as
green fluorescent protein (GFP), or β-galactosidase (or a nucleic acid
encoding either of these proteins). Selectable markers such as these, which
are not required for cell survival and/or proliferation in the presence of a
selection agent, may also be referred to as reporter molecules. Other
selectable markers, e.g., the neomycin phosphotransferase gene, provide for
isolation and identification of cells containing them by conferring properties
on the cells that make them resistant to an agent, e.g., a drug such as an
antibiotic, that inhibits proliferation of cells that do not contain the marker.

As used herein, growth under selective conditions means growth of a 
cell under conditions that require expression of a selectable marker for
survival. 

As used herein, an agent that destabilizes a chromosome is any agent
known by those of skill in the art to enhance amplification events and/or
mutations. Such agents, which include BrdU, are well known to those of
skill in the art.

In order to generate an artificial chromosome containing a particular 
heterologous nucleic acid of interest, it is possible to include the nucleic acid 
of interest in the nucleic acid that is being introduced into cells to initiate 
production of the artificial chromosome. Thus, for example, a nucleic acid of
interest could be introduced into a cell along with nucleic acid encoding a
selectable marker and/or a nucleic acid that targets to a heterochromatic
region of a chromosome. For example, the nucleic acid of interest can be
linked to targeting nucleic acid(s). Alternatively, heterologous nucleic acid of
interest can be introduced into an artificial chromosome at a later time after
the initial generation of the artificial chromosome.

As used herein, the minichromosome refers to a chromosome derived 
from a multicentric, typically dicentric, chromosome that contains more 
euchromatic than heterochromatic DNA. For purposes herein, the 
minichromosome contains a de novo centromere, preferably a centromere
that replicates in plants, more preferably a plant centromere.

As used herein, de novo with reference to a centromere, refers to
generation of an excess centromere in a chromosome as a result of
incorporation of a heterologous nucleic acid fragment using the methods
herein.

As used herein, in vitro assembled artificial chromosomes or synthetic
chromosomes are artificial chromosomes produced by joining essential
components of a chromosome in vitro. These components include at least a
centromere, a telomere and an origin of replication. An in vitro assembled
artificial chromosome may include one or more megareplicators. In particular
embodiments, the megareplicator contains sequences of rDNA, particularly
plant rDNA.

As used herein, in vitro assembled plant artificial chromosomes are
produced by joining components (e.g., the centromere, telomere(s),
megareplicator and an origin of replication) that function in plants, and
preferably, one or more of which is derived from a plant. In vitro assembled
artificial chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, or may contain
increasing amounts of euchromatic DNA, such that, for example, it contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
about 90% euchromatic DNA. In vitro assembled artificial chromosomes
may contain one or more regions of segmentation as described with
reference to amplification-based artificial chromosomes.

As used herein, an artificial chromosome platform refers to an artificial
chromosome that has been engineered to include one or more sites for site
specific recombination-directed integration. Included within the artificial
chromosome platforms are ACes, particularly plant ACes, that are
so-engineered. Any sites, including but not limited to any described herein, that
are suitable for such integration are contemplated. Among the ACes
contemplated herein are those that are predominantly heterochromatic
(formerly referred to as satellite artificial chromosomes (SATACs); see, e.g.,
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183), artificial chromosomes predominantly made
up of repeating nucleic acid units and that contain substantially equivalent
amounts of euchromatic and heterochromatic DNA or wherein the repeat
regions of the chromosomes contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid. Included among the ACes for
use in generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants as described herein. These
include artificial chromosomes that have a centromere derived from a plant,
and, also, artificial chromosomes that have centromeres that may be derived
from other organisms but that function in plants.

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations thereof,
(such as, but not limited to, a restriction endonuclease, a modification
methylase and a recombinase) recognizes and binds. For example, a
recognition sequence for Cre recombinase (see, e.g., SEQ ID No. 30) is a 34
base pair sequence containing two 13 base pair inverted repeats (serving as
the recombinase binding sites) flanking an 8 base pair core and designated
loxP (see, e.g., Sauer (1994) Current Opinion in Biotechnology 5:521-527).
Other examples of recognition sequences include, but are not limited to,
attB and attP, attR and attL and others (see, e.g., SEQ ID Nos. 32-48), that
are recognized by the recombinase enzyme integrase (see SEQ ID Nos. 49
and 50 for the nucleotide and encoded amino acid sequences of an
exemplary lambda phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a 7
base pair overlap region; attP (SEQ ID No. 48) is an approximately 240 base
pair sequence containing core-type Int binding sites and arm-type Int binding
sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see, e.g., Landy
(1993) Current Opinion in Biotechnology 3:699-707; see, e.g., SEQ ID Nos.
32 and 48).

As used herein, a recombinase is an enzyme that catalyzes the 
exchange of DNA segments at specific recombination sites. An integrase 
herein refers to a recombinase that is a member of the lambda (λ) integrase
family. 

As used herein, recombination proteins include excisive proteins, 
integrative proteins, enzymes, co-factors and associated proteins that are 
involved in recombination reactions using one or more recombination sites
(see, Landy (1993) Current Opinion in Biotechnology 3:699-707).

As used herein the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP site
is a 34 base pair nucleotide sequence from bacteriophage P1 (see, e.g.,
Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402). The LoxP
site contains two 13 base pair inverted repeats separated by an 8 base pair
spacer region as follows (SEQ ID NO. 51):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT

E. coli DH5Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from the
American Type Culture Collection (ATCC) under accession numbers ATCC
53254 and ATCC 20773, respectively. The lox sites can be isolated from
plasmid pBS44 with restriction enzymes EcoRI and SalI, or XhoI and BamHI.
In addition, a preselected DNA segment can be inserted into pBS44 at either
the SalI or BamHI restriction enzyme sites. Other lox sites include, but are
not limited to, LoxB, LoxL, LoxC2 and LoxR sites, which are nucleotide
sequences isolated from E. coli (see, e.g., Hoess et al. (1982) Proc. Natl.
Acad. Sci. U.S.A. 79:3398). Lox sites can also be produced by a variety of
synthetic techniques (see, e.g., Ito et al. (1982) Nuc. Acid Res. 10:1755 and
Ogilvie et al. (1981) Science 214:270).

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can be
isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American Type
Culture Collection (ATCC) under accession numbers ATCC 53255 and ATCC
20772, respectively. The cre gene can be isolated from plasmid pBS39 with
restriction enzymes XhoI and SalI.
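By way of illustration only, the LoxP layout described above (two 13 base
pair inverted repeats separated by an 8 base pair spacer; SEQ ID NO. 51) can
be checked computationally. The following Python sketch is not part of the
methods provided herein; the function names are hypothetical, and the
sequence is assumed to be the 34 base pair sequence given above with the
spacing removed.

```python
# Illustrative check of the LoxP layout: 13 bp repeat + 8 bp spacer + 13 bp repeat.
# Function names are hypothetical; the sequence is SEQ ID NO. 51 as printed above.

LOXP = "ATAACTTCGTATA" + "ATGTATGC" + "TATACGAAGTTAT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox_site(site: str, repeat_len: int = 13, spacer_len: int = 8):
    """Split a lox-like site into (left repeat, spacer, right repeat)."""
    if len(site) != 2 * repeat_len + spacer_len:
        raise ValueError("unexpected site length")
    left = site[:repeat_len]
    spacer = site[repeat_len:repeat_len + spacer_len]
    right = site[repeat_len + spacer_len:]
    return left, spacer, right


left, spacer, right = split_lox_site(LOXP)

# The right repeat is the reverse complement of the left repeat (inverted repeats).
repeats_are_inverted = right == reverse_complement(left)

# The spacer is not its own reverse complement, which gives the site a direction.
spacer_is_asymmetric = spacer != reverse_complement(spacer)

print(len(LOXP), repeats_are_inverted, spacer_is_asymmetric)
```

The asymmetry of the spacer is what allows a lox site to be assigned an
orientation relative to another lox site on the same molecule.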

As used herein, site-specific recombination refers to site-specific 
recombination that is effected between two specific sites on a single nucleic 
acid molecule or between two different molecules that requires the presence 
of an exogenous protein, such as an integrase or recombinase. 

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox sites;

b. inversion of the nucleotide sequence of a pre-selected
DNA segment flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to lox
sites located on different DNA molecules.

This reciprocal exchange of DNA segments can result in an integration
event if one or both of the DNA molecules are circular. DNA segment refers
to a linear fragment of single- or double-stranded deoxyribonucleic acid
(DNA), which can be derived from any source. Since the lox site is an
asymmetrical nucleotide sequence, two lox sites on the same DNA molecule
can have the same or opposite orientations with respect to each other.
Recombination between lox sites in the same orientation results in a deletion
of the DNA segment located between the two lox sites and a connection
between the resulting ends of the original DNA molecule. The deleted DNA
segment forms a circular molecule of DNA. The original DNA molecule and
the resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule results
in an inversion of the nucleotide sequence of the DNA segment located
between the two lox sites. In addition, reciprocal exchange of DNA
segments proximate to lox sites located on two different DNA molecules can
occur. All of these recombination events are catalyzed by the gene product
of the cre gene. Thus, the Cre-lox system can be used to specifically delete,
invert, or insert DNA. The precise event is controlled by the orientation of
lox DNA sequences. In cis, the lox sequences direct the Cre recombinase to
either delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.
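The dependence of the in cis outcome on lox site orientation, as described
above, can be sketched as a simple model. The Python example below is an
illustrative simplification only: the token representation ("lox>" and
"lox<" for the two orientations, an apostrophe for an inverted segment) and
the function names are assumptions made for this sketch, and the in trans
insertion event is not modeled.

```python
from typing import List, Optional, Tuple

# Hypothetical tokens: a lox site in one orientation or the other.
FORWARD, REVERSE = "lox>", "lox<"


def _flip(token: str) -> str:
    """Flip the orientation of a token carried on an inverted stretch of DNA."""
    if token == FORWARD:
        return REVERSE
    if token == REVERSE:
        return FORWARD
    # Segment names: a trailing apostrophe marks inverted orientation.
    return token[:-1] if token.endswith("'") else token + "'"


def recombine_in_cis(
    molecule: List[str],
) -> Tuple[str, List[str], Optional[List[str]]]:
    """Model Cre acting on a linear molecule carrying exactly two lox sites.

    Returns (event, resulting linear molecule, excised circle or None).
    """
    sites = [i for i, t in enumerate(molecule) if t in (FORWARD, REVERSE)]
    if len(sites) != 2:
        raise ValueError("this sketch expects exactly two lox sites")
    i, j = sites
    between = molecule[i + 1:j]
    if molecule[i] == molecule[j]:
        # Same orientation: the flanked segment is excised as a circle, and
        # the linear product and the circle each retain a single lox site.
        remaining = molecule[:i + 1] + molecule[j + 1:]
        circle = [molecule[j]] + between
        return "deletion", remaining, circle
    # Opposite orientation: the flanked segment is inverted in place.
    inverted = [_flip(t) for t in reversed(between)]
    return "inversion", molecule[:i + 1] + inverted + molecule[j:], None


deletion = recombine_in_cis(["A", "lox>", "B", "lox>", "C"])
inversion = recombine_in_cis(["A", "lox>", "B", "C", "lox<", "D"])
print(deletion)
print(inversion)
```

In this model, the deletion product and the excised circle each carry one
lox site, consistent with the description above.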

As used herein, a plant refers to an organism that is taxonomically
classified as being in the kingdom Plantae. Such organisms include
eukaryotic organisms that contain chloroplasts capable of carrying out
photosynthesis. A plant can be unicellular or multicellular and can contain
multiple tissues and/or organs. Plants can reproduce sexually and/or
asexually and include species that are perennial or annual in growth habit.
Plants can be found to exist in a variety of habitats, including terrestrial and
aquatic environments. The term "plant" includes a whole plant, plant cell,
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and other
parts of a whole plant.

As used herein, reproductive mode with reference to a plant refers to 
any and all methods by which a plant produces progeny. Reproductive 
modes include, but are not limited to, sexual and asexual reproduction. 
Plants may produce progeny by one or multiple reproductive modes. Sexual
reproduction can include union of cells derived from haploid gametophytes
(e.g., eggs produced from ovules and sperm produced from pollen in seed
plants) to form diploid zygotes. Zygotes may be formed from gametophytes
from different plants or from gametophytes of the same plant (e.g., through
self-fertilization). Asexual reproduction can occur when offspring are
produced through modifications of the sexual life cycle that do not include
meiosis and syngamy. For example, when vascular plants reproduce 
asexually, they may do so by vegetative reproduction, such as budding, 
branching, and tillering, or by producing spores or seed genetically identical 
to the sporophytes that produced them. 

As used herein, stable maintenance of chromosomes occurs when at
least about 85%, preferably 90%, more preferably 95%, of the cells retain
the chromosome. Stability is measured in the presence of a selective agent.
Preferably these chromosomes are also maintained in the absence of a
selective agent. Stable chromosomes also retain their structure during cell
culturing, suffering no unintended intrachromosomal or interchromosomal
rearrangements.
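As a minimal illustration of the retention thresholds recited above (at
least about 85%, preferably 90%, more preferably 95% of cells retaining the
chromosome), the following Python sketch classifies a cell count. The
function name and tier labels are hypothetical, the word "about" is treated
here as an exact cutoff purely for simplicity, and structural stability
(absence of rearrangements) is not assessed.

```python
def stability_tier(cells_retaining: int, cells_scored: int) -> str:
    """Classify chromosome retention against the 85% / 90% / 95% thresholds.

    Integer arithmetic is used so that counts falling exactly on a
    threshold are classified without floating-point rounding effects.
    """
    if cells_scored <= 0:
        raise ValueError("cells_scored must be positive")
    if not 0 <= cells_retaining <= cells_scored:
        raise ValueError("cells_retaining must lie between 0 and cells_scored")
    if cells_retaining * 100 >= 95 * cells_scored:
        return "more preferred"
    if cells_retaining * 100 >= 90 * cells_scored:
        return "preferred"
    if cells_retaining * 100 >= 85 * cells_scored:
        return "stable"
    return "not stable"


# Example: 92 of 100 scored cells retain the chromosome.
print(stability_tier(92, 100))
```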

As used herein, BrdU refers to 5-bromodeoxyuridine, which during 
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis of
proteins. Ribosomal RNA is produced by transcription of genes which, in
eukaryotic cells, are present in multiple copies. In human cells, the
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences that
encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
[see e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al.
(1993) Mamm. Genome 4:49-52]. In Arabidopsis thaliana the presence of
rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S, and 25S
rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) [see The Arabidopsis
Genome Initiative (2000) Nature 408:796-815]. In eukaryotic cells, the
multiple copies of the highly conserved rRNA genes are located in a tandemly
arranged series of rDNA units, which are generally about 40-45 kb in length
and contain a transcribed region and a nontranscribed region known as
spacer (i.e., intergenic spacer) DNA which can vary in length and sequence.
In the human and mouse, these tandem arrays of rDNA units are located
adjacent to the pericentric satellite DNA sequences (heterochromatin). The
regions of these chromosomes in which the rDNA is located are referred to
as nucleolar organizing regions (NOR) which loop into the nucleolus, the site
of ribosome production within the cell nucleus. In higher plants, the rDNA is
arranged in long tandem repeating units, similar to those of other higher
eukaryotes. The 18S, 5.8S and 25S rRNA genes are clustered and are
transcribed as one unit, while the 5S genes are located elsewhere in the
genome. Between the 3' end of the 25S gene and the 5' end of the 18S
gene is located a DNA spacer that ranges from 1 kb to greater than 12 kb in
length for different species. Therefore, the rDNA repeat ranges from about 4
kb to about 15 kb for different plant species [see, e.g., Rogers and Bendich
(1987) Plant Mol. Biol. 9:509-520].

As used herein, a megachromosome refers to a chromosome that,
except for introduced heterologous DNA, is substantially composed of
heterochromatin. Megachromosomes are made up of an array of repeated
amplicons that contain two inverted megareplicons bordered by introduced
heterologous DNA [see, e.g., Figure 3 of U.S. Patent No. 6,077,697 for a
schematic drawing of a megachromosome]. For purposes herein, a
megachromosome is about 50 to 400 Mb, generally about 250-400 Mb.
Shorter variants are also referred to as truncated megachromosomes [about
90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell
lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For
purposes herein, the term megachromosome refers to the overall repeated
structure based on an array of repeated chromosomal segments (amplicons)
that contain two inverted megareplicons bordered by any inserted
heterologous DNA.
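The approximate size ranges recited above can be collected into a small
lookup, shown here only as an illustrative sketch. The ranges are stated as
"about" in the text, so the inclusive boundaries, the dictionary keys and the
function name below are assumptions of this example; a size at a shared
boundary (e.g., 150 Mb) is reported under every range that contains it
rather than being forced into one class.

```python
from typing import Dict, List, Tuple

# Approximate ranges in Mb, taken from the definition above (inclusive bounds
# are an assumption of this sketch).
SIZE_CLASSES_MB: Dict[str, Tuple[float, float]] = {
    "megachromosome (overall)": (50, 400),
    "megachromosome (general range)": (250, 400),
    "truncated megachromosome": (90, 150),
    "dwarf megachromosome": (150, 200),
    "micro-megachromosome": (50, 90),
}


def matching_size_classes(size_mb: float) -> List[str]:
    """Return every named size range that contains the given size in Mb."""
    return [
        name
        for name, (low, high) in SIZE_CLASSES_MB.items()
        if low <= size_mb <= high
    ]


print(matching_size_classes(60))
print(matching_size_classes(300))
```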

As used herein, transformation and transfection are used
interchangeably to refer to the process of introducing nucleic acid
into cells. The terms transfection and transformation refer to the
taking up of exogenous nucleic acid, e.g., an expression vector, by a host
cell whether or not any coding sequences are in fact expressed. Numerous
methods of introducing nucleic acids into cells are known to the ordinarily
skilled artisan, for example, by Agrobacterium-mediated transformation,
protoplast transfection (including polyethylene glycol (PEG)-mediated
transfection, electroporation, protoplast fusion, and microcell fusion),
lipid-mediated delivery, liposomes, electroporation, microinjection, particle
bombardment and silicon carbide whisker-mediated transformation (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985) Mol.
Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-1004;
Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J. and Vasil,
I.K., Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948), direct uptake using calcium phosphate [CaPO4;
see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376],
polyethylene glycol [PEG]-mediated DNA uptake, lipofection [see, e.g.,
Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see Lambert
(1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No.
5,396,767; Sanford et al. (1987) Somatic Cell Mol. Genet. 13:279-284;
Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary
et al. (1995) Meth. Enzymol. 264:133-152], lipid-mediated carrier systems
[see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996)
Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim.
31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et
al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth.
Enzymol. 217:599-618] or other suitable method. Successful transfection is
generally recognized by detection of the presence of the heterologous nucleic
acid within the transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector within
the host cell.

As used herein, injected refers to the microinjection (use of a small
syringe, needle, or pipette) of nucleic acid into a cell.

As used herein, gene therapy involves the transfer or insertion of 
nucleic acid molecules into certain cells, which are also referred to as target 
cells, to produce products that are involved in preventing, curing, correcting,
controlling or modulating diseases, disorders and/or deleterious conditions.
The nucleic acid is introduced into the selected target cells in a manner such
that the nucleic acid is expressed and a product encoded thereby is
produced. Alternatively, the nucleic acid may in some manner mediate
expression of DNA that encodes a therapeutic product. This product may be
a therapeutic compound, which is produced in therapeutically effective
amounts or at a therapeutically useful time. It may also encode a product,
such as a peptide or RNA, that in some manner mediates, directly or
indirectly, expression of a therapeutic product. Expression of the nucleic
acid by the target cells within an organism afflicted with a disease or
disorder thereby enables modulation of the disease or disorder. The nucleic
acid encoding the therapeutic product may be modified prior to introduction 
into the cells of the afflicted host in order to enhance or otherwise alter the 
product or expression thereof. 

For use in gene therapy, cells can be transfected in vitro, followed by
introduction of the transfected cells into an organism. This is often referred
to as ex vivo gene therapy. Alternatively, the cells can be transfected
directly in vivo within an organism. 

As used herein, a therapeutically effective product is a product that 
effectively ameliorates or eliminates the symptoms or manifestations of an 
inherited or acquired disease or disorder or that cures said disease or disorder
in an organism. For example, therapeutically effective products include a 
product that is encoded by heterologous DNA expressed in a diseased 
organism and a product produced from heterologous DNA in a host cell and 
to which a diseased organism is exposed. 

As used herein, a transgenic plant refers to a plant (e.g., a plant cell,
tissue, organ or whole plant) containing heterologous or foreign nucleic acid
or in which the expression of a gene naturally present in the plant has been
altered. Heterologous nucleic acid within a transgenic plant may be
transiently or stably maintained within the plant. Stable maintenance of
heterologous nucleic acid may be maintenance of the nucleic acid through
one or more, or two or more, or five or more, or ten or more, or 25 or more,
or 50 or more or 60 or more cell divisions. A transgenic plant may contain
heterologous nucleic acid in one cell, multiple cells or all cells. A transgenic
plant may produce progeny that contain or do not contain the heterologous
nucleic acid.

As used herein, a promoter, with respect to a region of DNA, refers to
a sequence of DNA that contains a sequence of bases that signals RNA
polymerase to associate with the DNA and initiate transcription of messenger
RNA (mRNA) from a template strand of the DNA. A promoter thus generally
regulates transcription of DNA into mRNA.

As used herein, operative linkage of heterologous DNA to regulatory
and effector sequences of nucleotides, such as promoters, enhancers,
transcriptional and translational stop sites, and other signal sequences refers
to the relationship between such DNA and such sequences of nucleotides.
For example, operative linkage of heterologous DNA to a promoter refers to
the physical relationship between the DNA and the promoter such that the
transcription of such DNA is initiated from the promoter by an RNA
polymerase that specifically recognizes, binds to and transcribes the DNA in
reading frame.

As used herein, isolated, substantially pure nucleic acid, such as, for
example, DNA, refers to nucleic acid fragments purified according to
standard techniques employed by those skilled in the art, such as that found
in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY].

As used herein, expression refers to the transcription and/or
translation of nucleic acid. For example, expression can be the transcription
of a gene into an RNA molecule, such as a messenger RNA (mRNA)
molecule. Expression may further include translation of an RNA molecule
into peptides, polypeptides, or proteins. If the nucleic acid is derived from
genomic DNA, expression may, if an appropriate eukaryotic host cell or
organism is selected, include splicing of the mRNA. With respect to an
antisense construct, expression may refer to the transcription of the
antisense DNA.

As used herein, vector or plasmid refers to discrete elements that are
used to introduce heterologous nucleic acids into cells for either expression
of the heterologous nucleic acid or for replication of the heterologous nucleic
acid. Selection and use of such vectors and plasmids are well within the
level of skill of the art.

As used herein, substantially homologous DNA refers to DNA that
includes a sequence of nucleotides that is sufficiently similar to another such
sequence to form stable hybrids under specified conditions.

It is well known to those of skill in this art that nucleic acid fragments
with different sequences may, under the same conditions, hybridize
detectably to the same "target" nucleic acid. Two nucleic acid fragments
hybridize detectably, under stringent conditions over a sufficiently long
hybridization period, because one fragment contains a segment of at least
about 14 nucleotides in a sequence which is complementary (or nearly
complementary) to the sequence of at least one segment in the other nucleic
acid fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with exactly complementary
base-pairing segments hybridize detectably to each other, departures from
exact complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between the
base-pairing segments of two nucleic acids becomes larger, and as conditions
of the hybridization become more stringent, the probability decreases that
the two segments will hybridize detectably to each other.

Two single-stranded nucleic acid segments have "substantially the
same sequence," within the meaning of the present specification, if (a) both
form a base-paired duplex with the same segment, and (b) the melting
temperatures of said two duplexes in a solution of 0.5 X SSPE differ by less
than 10°C. If the segments being compared have the same number of
bases, then to have "substantially the same sequence", they will typically
differ in their sequences at fewer than 1 base in 10. Methods for determining
melting temperatures of nucleic acid duplexes are well known [see, e.g.,
Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references
cited therein].
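The "fewer than 1 base in 10" rule of thumb for equal-length segments can be illustrated with a short sketch. It is only a sequence-level screen under stated assumptions: the function names are hypothetical, the comparison is ungapped, and it does not replace the melting-temperature criterion in (b), which must be determined experimentally.

```python
def mismatch_fraction(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same number of bases")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    # Count position-by-position differences, ignoring letter case.
    differences = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return differences / len(seq_a)


def typically_same_sequence(seq_a: str, seq_b: str) -> bool:
    """Rule of thumb from the text: fewer than 1 differing base in 10."""
    return mismatch_fraction(seq_a, seq_b) < 0.1
```

For instance, two 20-base segments that differ at a single position pass this screen, whereas two 10-base segments that differ at one position do not, because 1 in 10 is not "fewer than" 1 in 10.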

As used herein, a nucleic acid probe is a DNA or RNA fragment that
includes a sufficient number of nucleotides to specifically hybridize to DNA or
RNA that includes identical or closely related sequences of nucleotides. A
probe may contain any number of nucleotides, from as few as about 10 and
as many as hundreds of thousands of nucleotides. The conditions and
protocols for such hybridization reactions are well known to those of skill in
the art as are the effects of probe size, temperature, degree of mismatch,
salt concentration and other parameters on the hybridization reaction. For
example, the lower the temperature and higher the salt concentration at
which the hybridization reaction is carried out, the greater the degree of
mismatch that may be present in the hybrid molecules.

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labelling it with a detectable moiety or label, such as
32P, 3H and 14C, or by other means, including chemical labelling, such as by
nick-translation in the presence of deoxyuridylate biotinylated at the
5'-position of the uracil moiety. The resulting probe includes the biotinylated
uridylate in place of thymidylate residues and can be detected (via the biotin
moieties) by any of a number of commercially available detection systems
based on binding of streptavidin to the biotin. Such commercially available
detection systems can be obtained, for example, from Enzo Biochemicals,
Inc. (New York, NY). Any other label known to those of skill in the art,
including non-radioactive labels, may be used as long as it renders the probes
sufficiently detectable, which is a function of the sensitivity of the assay, the
time available (for culturing cells, extracting DNA, and hybridization assays),
the quantity of DNA or RNA available as a source of the probe, the particular
label and the means used to detect the label.

Once sequences with a sufficiently high degree of homology to the
probe are identified, they can readily be isolated by standard techniques,
which are described, for example, by Maniatis et al. [(1982) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY].

As used herein, conditions under which DNA molecules form stable
hybrids and are considered substantially homologous are such that DNA
molecules with at least about 60% complementarity form stable hybrids.
Such DNA fragments are herein considered to be "substantially
homologous". For example, DNA that encodes a particular protein is
substantially homologous to another DNA fragment if the DNA forms stable
hybrids such that the sequences of the fragments are at least about 60%
complementary and if a protein encoded by the DNA retains its activity.
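As a rough illustration of the 60% figure, the following sketch computes percent complementarity between two strands written 5' to 3'. It assumes equal-length, ungapped strands of A, C, G and T only, and the function names are hypothetical. Whether stable hybrids actually form, and whether an encoded protein retains its activity, can only be established experimentally.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_complementarity(strand: str, other: str) -> float:
    """Percent of positions at which `strand` pairs with `other`.

    Both strands are given 5' to 3'; `other` is reverse-complemented so
    that pairing positions line up. No gaps or offsets are considered.
    """
    target = reverse_complement(other)
    if not strand or len(strand) != len(target):
        raise ValueError("strands must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(strand.upper(), target) if a == b)
    return 100.0 * matches / len(strand)


def meets_homology_threshold(strand: str, other: str, threshold: float = 60.0) -> bool:
    """True if complementarity reaches the 'at least about 60%' figure."""
    return percent_complementarity(strand, other) >= threshold
```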

For purposes herein, the following stringency conditions are defined:
1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C
2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C
3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C
or any combination of salt and temperature and other reagents that result in
selection of the same degree of mismatch or matching.
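For record keeping, the three defined conditions can be captured as a small lookup table. This is a minimal sketch; the class and field names are illustrative assumptions, and only the numeric values come from the definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stringency:
    sspe_x: float       # SSPE concentration, in multiples of 1 x SSPE
    sds_percent: float  # SDS concentration, in percent
    temp_c: float       # temperature, in degrees Celsius


# Values transcribed from the three conditions defined in the text.
STRINGENCY = {
    "high": Stringency(sspe_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": Stringency(sspe_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": Stringency(sspe_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def describe(level: str) -> str:
    """Human-readable form of one defined stringency condition."""
    c = STRINGENCY[level]
    return f"{c.sspe_x:g} x SSPE, {c.sds_percent:g}% SDS, {c.temp_c:g}°C"
```

The table deliberately omits the "equivalent combinations" clause, since equivalence of other salt and temperature combinations is an experimental judgment rather than a lookup.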

As used herein, all assays and procedures, such as hybridization
reactions and antibody-antigen reactions, unless otherwise specified, are
conducted under conditions recognized by those of skill in the art as
standard conditions.

A. Amplification of Chromosomal Segments and Use Thereof In the 
Generation of Artificial Chromosomes 

The methods, cells and artificial chromosomes provided herein are
produced by virtue of the discovery of the existence of a higher-order
replication unit (megareplicon) of the centromeric region, including the
pericentric DNA, of a chromosome. This megareplicon is delimited by a
primary replication initiation site (megareplicator), and appears to facilitate
replication of the centromeric heterochromatin, and, most likely,
centromeres. Integration of heterologous nucleic acid into the megareplicator
region, or in close proximity thereto, initiates a large-scale amplification of
megabase-size chromosomal segments. Products of such amplification may
be used as artificial chromosomes or in the generation of artificial
chromosomes as described herein.

Included among the DNA sequences that may provide a
megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In
plants and animals, particularly mammals such as mice and humans, these
rDNA units can contain specialized elements, such as the origin of replication
(or origin of bidirectional replication, i.e., OBR, in mouse) and
amplification-promoting sequences (APS) and amplification control elements
(ACE) [see, e.g., with respect to plant rDNA, U.S. Patent Nos. 6,096,546 (to
Raskin) and 6,100,092 (to Borysyuk et al.); PCT International Application
Publication No. WO 99/66058; GenBank Accession no. Y08422 (containing the
central AT-rich region of a tobacco rDNA intergenic spacer); Borysyuk et al.
(1997) Plant Mol. Biol. 35:655-660; Borysyuk et al. (2000) Nature
Biotechnology 18:1303-1306; Hernandez et al. (1993) EMBO J. 12:1475-1485;
Van't Hof and Lamm (1992) Plant Mol. Biol. 20:377-382; Hernandez et al.
(1988) Plant Mol. Biol. 10:413-322; and with respect to mammalian rDNA,
Gogel et al. (1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp.
Cell. Res. 209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613;
Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester
(1995) Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res.
10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527].

As described herein, without being bound by any theory, specialized
elements such as these may facilitate replication and/or amplification of
megabase-size chromosomal segments in the de novo formation of
chromosomes, such as the artificial chromosomes described herein, in cells.
These specialized elements are typically located in the nontranscribed
intergenic spacer region upstream of the transcribed region of rDNA. The
intergenic spacer region may itself contain internally repeated sequences
which can be classified as tandemly repeated blocks and nontandem blocks
(see, e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse
rDNA, an origin of bidirectional replication may be found within a 3-kb
initiation zone centered approximately 1.6 kb upstream of the transcription
start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The
sequences of these specialized elements tend to have an altered chromatin
structure, which may be detected, for example, by nuclease hypersensitivity
or the presence of AT-rich regions that can give rise to bent DNA structures.

Sequences of intergenic spacer regions of plant rDNA include, but are
not limited to, sequences contained in GenBank Accession numbers S70723
(from the 5S rDNA of barley (Hordeum vulgare)), AF013103 and X03989
(from maize (Zea mays)), X65489 (from potato (Solanum tuberosum)),
X52265 (from tomato (Lycopersicon esculentum)), AF177418 (from
Arabidopsis neglecta), AF177421 and AF17422 (from Arabidopsis halleri),
A71562, X15550 and X52631 (from Arabidopsis thaliana; see Gruendler et
al. (1991) J. Mol. Biol. 221:1209-1222 and Gruendler et al. (1989) Nucleic
Acids Res. 17:6395-6396), X54194 (from rice (Oryza sativa)) and Y08422
and D76443 (from tobacco (Nicotiana tabacum)). Sequences of intergenic
spacer regions of plant rDNA further include sequences from rye (see Appels
et al. (1986) Can. J. Genet. Cytol. 28:673-685), wheat (see Barker et al.
(1988) J. Mol. Biol. 201:1-17 and Sardana and Flavell (1996) Genome
39:288-292), radish (see Delcasso-Tremousaygue et al. (1988) Eur. J.
Biochem. 172:767-776), Vicia faba and Pisum sativum (see Kato et al.
(1990) Plant Mol. Biol. 14:983-993), mung bean (see Gerstner et al. (1988)
Genome 30:723-733; and Schiebel et al. (1989) Mol. Gen. Genet.
218:302-307), tomato (see Schmidt-Puchta et al. (1989) Plant Mol. Biol.
13:251-253), Hordeum bulbosum (see Procunier et al. (1990) Plant Mol. Biol.
15:661-663) and Lens culinaris Medik., and other legume species (see
Fernandez et al. (2000) Genome 43:597-603). Nucleic acids containing
intergenic spacer sequences from plants can be obtained by nucleic acid
amplification of DNA from plant cells using oligonucleotide primers
corresponding to the 3' end of the conserved 25S mature rRNA encoding
region and the 5' end of the conserved 18S mature rRNA encoding region
(see, e.g., PCT Application Publication No. WO 98/13505).
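The primer orientation implied by this amplification strategy can be sketched as follows. In an rDNA repeat the intergenic spacer lies between the 3' end of the 25S region and the 5' end of the next 18S region, so the forward primer is taken from the sense strand at the end of 25S and the reverse primer is the reverse complement of the start of 18S. The function name and default primer length are assumptions, and the example sequences in the test are toy placeholders, not real rRNA gene sequences.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def spacer_primers(end_of_25s: str, start_of_18s: str, length: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers flanking the intergenic spacer.

    end_of_25s   : sense-strand sequence ending at the 3' end of the 25S region
    start_of_18s : sense-strand sequence beginning at the 5' end of the 18S region
    length       : primer length in nucleotides (an illustrative default)
    """
    if length < 1 or len(end_of_25s) < length or len(start_of_18s) < length:
        raise ValueError("flanking sequences must be at least `length` bases long")
    forward = end_of_25s[-length:].upper()               # reads toward the spacer
    reverse = reverse_complement(start_of_18s[:length])  # reads back into the spacer
    return forward, reverse
```

Practical primer design would also weigh melting temperature, secondary structure and specificity, none of which this sketch attempts.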

An exemplary sequence encompassing a mammalian origin of
replication is provided in GENBANK accession no. X82564 at about positions
2430-5435. Exemplary sequences encompassing mammalian
amplification-promoting sequences include nucleotides 690-1060 and
1105-1530 of GENBANK accession no. X82564 and are also provided in PCT
Application Publication No. WO 97/40183. Exemplary sequences encompassing
plant amplification-promoting sequences (APS) include those provided in U.S.
Patent No. 6,100,092.
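Nucleotide positions such as these are conventionally read as 1-based and inclusive. A short sketch of extracting such a region from a sequence string already loaded into memory follows; the function and constant names are hypothetical, and retrieval of the X82564 record itself is not shown.

```python
# Coordinates quoted in the text for GENBANK accession no. X82564
# (read here as 1-based, inclusive positions).
ORIGIN_RANGE = (2430, 5435)
APS_RANGES = [(690, 1060), (1105, 1530)]


def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based inclusive range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of a sequence."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("range falls outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]
```

Under this reading, the two amplification-promoting regions span 371 and 426 nucleotides and the origin-containing region spans 3006 nucleotides.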
In human rDNA, a primary replication initiation site may be found a
few kilobase pairs upstream of the transcribed region and secondary initiation
sites may be found throughout the nontranscribed intergenic spacer region
(see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete
human rDNA repeat unit is presented in GENBANK as accession no. U13369.
Another exemplary sequence encompassing a replication initiation site may
be found within the sequence of nucleotides 35355-42486 in GENBANK
accession no. U13369, particularly within the sequence of nucleotides
37912-42486 and more particularly within the sequence of nucleotides
37912-39288 of GENBANK accession no. U13369 (see Coffman et al.
(1993) Exp. Cell. Res. 209:123-132).

B. Preparation of Plant Artificial Chromosomes 

Cell lines containing artificial chromosomes can be prepared by
transforming cells, preferably a stable cell line, with heterologous nucleic acid
and identifying cells that contain an artificial chromosome as described
herein. The artificial chromosome is a chromosomal structure that is distinct
from any chromosome that existed in the cell prior to introduction of the
heterologous nucleic acid. A cell containing an artificial chromosome may be
identified using a variety of procedures, alone or in combination, as described
in detail herein. In particular embodiments of the methods described herein,
the heterologous nucleic acid contains a sequence that targets the nucleic
acid to an amplifiable region of a chromosome in the cell, such as, for
example, the pericentric heterochromatin and/or rDNA. A variety of targeting
sequences are provided herein.

Prior to analyzing transformed cells for the presence of an artificial
chromosome, the cells to be analyzed may be enriched with artificial
chromosome-containing cells using a variety of techniques depending on the
heterologous nucleic acid that was introduced into the host cell to initiate
generation of the artificial chromosomes. For example, if nucleic acid
encoding a selectable marker was included in the heterologous nucleic acid,
cells containing the marker may be selected for analysis. If the selectable
marker is one that confers resistance to a cytotoxic agent, e.g., bialaphos,
hygromycin or kanamycin, the transformed cells may be cultured under
selective conditions which include the agent. Cells surviving growth under
selective conditions are then analyzed for the presence of artificial
chromosomes. If the selectable marker is a readily detectable reporter
molecule, such as, for example, a fluorescent protein, the transformed cells
may be selected on the basis of fluorescent properties. For example, cells
containing the fluorescent protein may be isolated from nontransformed cells
using a fluorescence-activated cell sorter (FACS).

In analyzing transformed cells for the presence of artificial
chromosomes, it is also possible to identify cells that have a multicentric,
typically dicentric, chromosome, formerly multicentric (typically dicentric)
chromosome, minichromosome and/or heterochromatic structures, such as a
megachromosome and a sausage chromosome. If cells containing
multicentric chromosomes or formerly multicentric (typically formerly
dicentric) chromosomes are initially selected, these cells can then be
manipulated, if need be, as described herein to produce the
minichromosomes and other artificial chromosomes, particularly the
heterochromatic artificial chromosomes and other segmented, repeat
region-containing artificial chromosomes, as described herein.

1. Cells used in the generation of plant artificial chromosomes

Any cells harboring plant centromere-containing chromosomes may be
used in the generation of plant artificial chromosomes (PACs). Such cells
include, but are not limited to, plant cells, protoplasts, and cells that are
hybrid cells of one or more plant species. Preferred cells are those that
harbor plant centromere-containing chromosomes and are readily susceptible
to the introduction of heterologous nucleic acids therein.

Cells for use in the generation of plant artificial chromosomes include
cells that harbor acrocentric plant chromosomes. Examples of acrocentric
plant chromosomes include chromosomes 2 and 4 of the plant Arabidopsis
thaliana (see, e.g., Mayer et al. (1999) Nature 402:769-777; Murata et al.
(1997) The Plant Journal 12:31-37; The Arabidopsis Genome Initiative
(2000) Nature 408:796-815), four acrocentric chromosome pairs in
Helianthus annuus (sunflower; see Schrader et al. (1997) Chromosome Res.
5:451-456), two pairs of acrocentric chromosomes in domesticated pepper
plant (Capsicum annuum) and a nearly acrocentric chromosome in lentil
plant. In particular embodiments of the methods described herein, cells
harboring acrocentric plant chromosomes containing rDNA are used in
generating plant artificial chromosomes.

Plant species from which cells may be obtained include, but are not
limited to, vegetable crops, fruit and vine crops, field plants, bedding plants,
trees, shrubs, and other nursery stock. Examples of vegetable crops include
artichokes, kohlrabi, arugula, leeks, asparagus, lettuce, bok choy, malanga,
broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew,
cantaloupe), Brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower,
okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage,
peppers, collards, potatoes, cucumber plants, pumpkins, cucurbits, radishes,
dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic,
spinach, green onions, squash, greens, beet, sweet potatoes, swiss chard,
horseradish, tomatoes, kale, turnips and spices. Fruit and vine crops include
apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince,
almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries,
boysenberries, cranberries, currants, loganberries, raspberries, strawberries,
blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate,
pineapple, tropical fruits, pomes, melon, mango, papaya and lychee.
Field crop plants include evening primrose, meadow foam, corn,
maize, hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye,
wheat, and others), sorghum, tobacco, kapok, leguminous plants (beans,
lentils, peas, soybeans), oil plants (canola, rape, mustard, poppy, olives,
sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants
(cotton, flax, hemp, jute), lauraceae (cinnamon, camphor) and plants such as
coffee, sugarcane, tea and natural rubber plants. Other examples of plants
include bedding plants such as flowers, cactus, succulents and ornamental
plants, as well as trees such as forest (broad-leaved trees and evergreens,
such as conifers), fruit, ornamental and nut-bearing trees, shrubs, algae,
moss, and duckweed.

2. Heterologous nucleic acids for use in generating plant artificial 
chromosomes 

a. Selectable markers 

The heterologous nucleic acid that is introduced into a cell in the
generation of artificial chromosomes as described herein may include nucleic
acid encoding a selectable marker. Any nucleic acid that includes a
selectable marker sequence may be introduced into cells harboring plant
centromere-containing chromosomes for the generation of plant artificial
chromosomes. Examples of selectable markers include, but are not limited
to, DNA encoding a product that confers resistance to a cytotoxic or
cytostatic agent and DNA encoding a readily detectable product, such as a
reporter protein.

(1) Nucleic acids encoding products that confer
resistance to a selection agent

Examples of selectable markers include the dihydrofolate reductase
(dhfr) gene, hygromycin phosphotransferase genes, the phosphinothricin
acetyl transferase gene (bar gene) and neomycin phosphotransferase genes.
Selectable markers that can be used in animal, e.g., mammalian, cells include,
but are not limited to, the thymidine kinase gene and the cellular
adenine-phosphoribosyltransferase gene.

Of particular interest for purposes herein are nucleic acid selectable
markers that, upon expression in the host cell, confer antibiotic or herbicide
resistance to the cell, sufficient to provide for the maintenance of
heterologous nucleic acids in the cell, and which facilitate the transfer of
artificial chromosomes containing the marker DNA into new host cells.
Examples of such markers include DNA encoding products that confer
cellular resistance to hygromycin, kanamycin, G418, bialaphos, Basta,
methotrexate, glyphosate, and puromycin. For example, neo (or nptII)
provides kanamycin resistance and can be selected for using kanamycin,
G418, paromomycin and other agents [see, e.g., Messing and Vierra (1982)
Gene 19:259-268; and Bevan et al. (1983) Nature 304:184-187]; bar from
Streptomyces hygroscopicus, which encodes the enzyme phosphinothricin
acetyl transferase (PAT), confers bialaphos, glufosinate, Basta or
phosphinothricin resistance [see, e.g., White et al. (1990) Nuc. Acids Res.
18:1062; Spencer et al. (1990) Theor. Appl. Genet. 79:625-631; Vickers et
al. (1996) Plant Mol. Biol. Reporter 14:363-368; and Thompson et al. (1987)
EMBO J. 6:2519-2523]; the hph gene confers resistance to the
antibiotic hygromycin (see, e.g., Blochinger and Diggelmann, Mol. Cell. Biol.
4:2929-2931); a mutant EPSP synthase protein [see Hinchee et al. (1988)
Bio/technol. 6:915-922] confers glyphosate resistance (see also U.S. Patent
Nos. 4,940,935 and 5,188,642); and a nitrilase such as bxn from Klebsiella
ozaenae confers resistance to bromoxynil [see Stalker et al. (1988) Science
242:419-42]. DNA encoding cystathionine gamma-synthase (CGS) can be
used as a marker that confers resistance to ethionine (see PCT Application
Publication No. WO 00/55303). Examples of markers that can be used in
animal, e.g., mammalian, cells include, but are not limited to, DNA encoding
products that confer cellular resistance to streptomycin, zeocin,
chloramphenicol and tetracycline.
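The marker-to-agent pairings named above can be summarized in a small lookup, which may be convenient when planning selection conditions. This is a sketch only: the dictionary keys and function name are illustrative, the pairings are limited to those stated explicitly in the text, and the "other agents" mentioned for neo are not enumerated.

```python
# Marker -> selection agents, as paired in the text above.
MARKER_AGENTS = {
    "neo (nptII)": ["kanamycin", "G418", "paromomycin"],
    "bar": ["bialaphos", "glufosinate", "Basta", "phosphinothricin"],
    "hph": ["hygromycin"],
    "mutant EPSP synthase": ["glyphosate"],
    "bxn": ["bromoxynil"],
    "CGS": ["ethionine"],
}


def markers_for_agent(agent: str) -> list[str]:
    """Markers that the text pairs with a given selection agent."""
    wanted = agent.lower()
    return sorted(
        marker
        for marker, agents in MARKER_AGENTS.items()
        if wanted in (a.lower() for a in agents)
    )
```

For example, `markers_for_agent("kanamycin")` returns the neo entry, and an agent not named in the text returns an empty list.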

(2) Reporter Molecules
Nucleic acids encoding reporter molecules may also be included in the
nucleic acid that is introduced into a recipient cell in the generation of
artificial chromosomes. Reporter genes provide a means for identifying cells
and chromosomes into which heterologous nucleic acids have been
transferred and further provide a means for assessing whether or not, and to
what extent, transferred DNA is expressed.

Nucleic acids encoding reporter molecules that may be used in
monitoring transfer and expression of heterologous nucleic acids into cells,
particularly plant cells, include, but are not limited to, nucleic acid encoding
β-glucuronidase (GUS) or the uidA gene product, which is an enzyme for which
various chromogenic substrates are known [see Novel and Novel (1973) Mol.
Gen. Genet. 120:319-335; Jefferson et al. (1986) Proc. Natl. Acad. Sci.
USA 83:8447-8451; US Patent No. 5,268,463; commercially available from
Clontech Laboratories, Palo Alto, CA], DNA from an R-locus gene, which
encodes a product that regulates the production of anthocyanin pigments
(red color) in plant tissues [see, e.g., Dellaporta et al. (1988) In
"Chromosome Structure and Function: Impact of New Concepts, 18th
Stadler Genetics Symposium" 11:263-282], nucleic acid encoding β-lactamase
[Sutcliffe (1978) Proc. Natl. Acad. Sci. U.S.A. 75:3737-3741], which is an
enzyme for which various chromogenic substrates are known (e.g., PADAC,
a chromogenic cephalosporin), DNA from a xylE gene [see, e.g., Zukowsky
et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:1101-1105], which encodes a
catechol dioxygenase that can convert chromogenic catechols, nucleic acid
encoding α-amylase [see, e.g., Ikuta et al. (1990) Bio/technol. 8:241-242],
nucleic acid encoding tyrosinase [see, e.g., Katz et al. (1983) J. Gen.
Microbiol. 129:2703-2714], an enzyme capable of oxidizing tyrosine to
DOPA and dopaquinone, which in turn condenses to form the readily
detectable compound melanin, nucleic acid encoding β-galactosidase, an
enzyme for which there are chromogenic substrates, nucleic acid encoding
luciferase (the lux gene) [see, e.g., Ow et al. (1986) Science 234:856-859],
which allows for bioluminescence detection, nucleic acid encoding aequorin
[see, e.g., Prasher et al. (1985) Biochem. Biophys. Res. Commun.
126:1259-1268], which may be employed in calcium-sensitive bioluminescence
detection, nucleic acid encoding a green fluorescent protein (GFP) [see, e.g.,
Sheen et al. (1995) Plant J. 8:777-784; Haseloff et al. (1997) Proc. Natl.
Acad. Sci. U.S.A. 94:2122-2127; Haseloff and Amos (1995) Trends Genet.
11:328-329; Reichel et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:5838-
5893; Tian et al. (1997) Plant Cell Rep. 16:267-271; Prasher et al. (1992)
Gene 111:229-233; Chalfie et al. (1994) Science 263:302; PCT Application
Publication Nos. WO 97/41228 and WO 95/07463; and commercially
available from Clontech Laboratories, Palo Alto, CA], nucleic acid encoding a
red or blue fluorescent protein (RFP or BFP, respectively), or nucleic acid
encoding chloramphenicol acetyltransferase (CAT).

Enhanced GFP (EGFP) is a mutant of GFP with a 35-fold increase in
fluorescence. This variant has mutations of Ser to Thr at amino acid 65 and
Phe to Leu at position 64 and is encoded by a gene with optimized human
codons (see, e.g., U.S. Patent No. 6,054,312). EGFP is a red-shifted variant
of wild-type GFP (Yang et al. (1996) Nucl. Acids Res. 24:4592-4593; Haas
et al. (1996) Curr. Biol. 6:315-324; Jackson et al. (1990) Trends Biochem.
15:477-483) that has been optimized for brighter fluorescence and higher
expression in mammalian cells (excitation maximum = 488 nm; emission
maximum = 507 nm). EGFP encodes the GFPmut1 variant (Jackson (1990)
Trends Biochem. 15:477-483), which contains the double-amino-acid
substitution of Phe-64 to Leu and Ser-65 to Thr. Sequences flanking EGFP
have been converted to a Kozak consensus translation initiation site (Huang
et al. (1990) Nucleic Acids Res. 18:937-947) to further increase the
translation efficiency in eukaryotic cells.
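The double substitution described above (Phe-64 to Leu and Ser-65 to Thr, written F64L and S65T in one-letter notation) can be expressed as a small routine that applies point substitutions to a protein sequence and checks the expected wild-type residue first. This is a sketch under stated assumptions: the function name is hypothetical, and `TOY_SEQUENCE` is an invented placeholder that merely carries Phe and Ser at positions 64 and 65; it is not the GFP sequence.

```python
import re


def apply_substitutions(protein: str, substitutions: list[str]) -> str:
    """Apply point substitutions such as 'F64L' (1-based positions)."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to a 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range in {sub!r}")
        if residues[index] != wild_type:
            raise ValueError(
                f"{sub!r} expects {wild_type} but found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


# Placeholder only: Met, then 62 Ala, then Phe-64, Ser-65, Tyr-66, Gly-67.
TOY_SEQUENCE = "M" + "A" * 62 + "FSYG"
```

Calling `apply_substitutions(TOY_SEQUENCE, ["F64L", "S65T"])` changes only positions 64 and 65; a substitution whose stated wild-type residue does not match the sequence is rejected rather than silently applied.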

Nucleic acid from the maize R gene complex can also be used as
nucleic acid encoding a reporter molecule. The R gene complex in maize
encodes a protein that acts to regulate the production of anthocyanin
pigments in most seed and plant tissue. Maize strains can have one, or as
many as four, R alleles which combine to regulate pigmentation in a
developmental and tissue-specific manner. Thus, an R gene introduced into
such cells will cause the expression of a red pigment and, if stably
incorporated, can be visually scored as a red sector. If a maize line carries
dominant alleles for genes encoding the enzymatic intermediates in the
anthocyanin biosynthetic pathway (C2, A1, A2, Bz1 and Bz2), but carries a
recessive allele at the R locus, the transformation of any cell from that line
with R will result in red pigment formation. Exemplary lines include
Wisconsin 22, which contains the rg-Stadler allele, and TR112, a K55
derivative which is r-g, b, Pl. Alternatively, any genotype of maize can be
utilized if the C1 and R alleles are introduced together.

b. Promoters and other sequences that influence gene 
expression 

Expression of nucleic acid encoding a selectable marker (or any
heterologous nucleic acid) in a recipient cell can be regulated by a variety of
promoters. Promoters for use in regulating transcription of DNA in cells,
particularly plant cells, include, but are not limited to, the nopaline synthase
(NOS) and octopine synthase (OCS) promoters; cauliflower mosaic virus
(CaMV) 19S and 35S promoters, the light-inducible promoter from the small
subunit of ribulose bis-phosphate carboxylase (ssRUBISCO, an abundant
plant polypeptide), the mannopine synthase (MAS) promoter [see, e.g.,
Velten et al. (1984) EMBO J. 3:2723-2730; and Velten and Schell (1985)
Nuc. Acids Res. 13:6981-6998], the rice actin promoter, the ubiquitin
promoter, for example, from Z. mays (see, e.g., PCT Application Publication
No. WO 00/60061), Arabidopsis thaliana UBI 3 promoter [see, e.g., Norris et
al. (1993) Plant Mol. Biol. 22:895-906] and the chemically inducible PR-1
promoter from tobacco or Arabidopsis (see, e.g., U.S. Patent No. 5,689,044).

Selection of a suitable promoter may include several considerations,
for example, recipient cell type (such as, for example, leaf epidermal cells,
mesophyll cells, root cortex cells), tissue- or organ-specific (e.g., roots,
leaves or flowers) expression of genes linked to the promoter, and timing and
level of expression (as may be influenced by constitutive vs. regulatable
promoters and promoter strength).

Additional sequences that may also be included in the nucleic acid
containing a selectable marker include, but are not restricted to, transcription
terminators and extraneous sequences to enhance expression such as
introns. A variety of transcription terminators may be used which are
responsible for termination of transcription beyond a coding region and
correct polyadenylation. Appropriate transcription terminators include those
that are known to function in plants such as, for example, the CaMV 35S
terminator, the tml terminator, the nopaline synthase terminator and the pea
rbcS E9 terminator, all of which may be used in both monocotyledonous and
dicotyledonous plants.

Numerous sequences have been found to enhance gene expression
from within the transcriptional unit and these sequences can be used in
conjunction with selectable marker and other genes to increase expression of
the genes in plant cells. For example, various intron sequences such as
introns of the maize Adh1 gene have been shown to enhance expression,
particularly in monocotyledonous cells. In addition, a number of
non-translated leader sequences derived from viruses are also known to enhance
expression, and these are particularly effective in dicotyledonous cells.

c. Nucleic acids containing targeting sequences 
Development of a multicentric, particularly dicentric, chromosome
typically is effected through integration of heterologous nucleic acid into
heterochromatin, such as the pericentric heterochromatin, near or within the
centromeric regions of chromosomes and/or into rDNA sequences. Thus, the
development of artificial chromosomes may be facilitated by targeting the
heterologous nucleic acid for integration into these regions, such as by
introducing DNA, including, but not limited to, rDNA (e.g., rDNA intergenic
spacer sequence), satellite DNA, pericentric DNA and lambda phage DNA,
into the recipient host cell. The targeting sequence may be introduced alone
or with other nucleic acids, including but not limited to selectable markers.
For example, a targeting sequence can be linked to a selectable marker.
Examples of plant pericentric DNA and satellite DNA include, but are
not limited to, pericentromeric sequences on tomato chromosome 6 [see,
e.g., Weide et al. (1998) Mol. Gen. Genet. 253:190-197], satellite DNA of
soybean [see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 23:857-862], pericentromeric
DNA of Arabidopsis thaliana [see, e.g., Tutois et al. (1999) Chromosome
Res. 7:143-156], satellite DNA of Arabidopsis thaliana (GenBank accession
nos. AB033593 and X58104), pericentric DNA of the chickpea [Cicer
arietinum L.; see, e.g., Staginnus et al. (1999) Plant Mol. Biol. 33:1037-1050],
satellite DNA on the rye B chromosome [see, e.g., Langdon et al.
(2000) Genetics 154:869-884], subtelomeric satellite DNA from Silene
latifolia [see, e.g., Garrido-Ramos et al. (1999) Genome 42:442-446] and
satellite DNA in the Saccharum complex [see, e.g., Alix et al. (1998)
Genome 41:854-864].

Examples of rDNA targeting sequences include nucleic acids from
plant and animal rDNA. Plant rDNA sequences include, but are not limited
to, sequences contained in GENBANK Accession numbers D16103 [from
rDNA of carrot (Daucus carota)], M23642 and M11585 [from rDNA encoding
24S rRNA of rice (Oryza sativa)], M26461 [from rDNA encoding 18S
rRNA of rice (Oryza sativa)], M16845 [from rDNA encoding 17S, 5.8S and
25S rRNA of rice (Oryza sativa)], X82780 and X82781 [from rDNA encoding
5S rRNA of potato (Solanum tuberosum)], AJ131161, AJ131162,
AJ131163, AJ131164, AJ131165, AJ131166 and AJ131167 [from rDNA
encoding 5S rRNA of tobacco (Nicotiana tabacum)], L36494 and U31016
through U31030 [from rDNA encoding 5S rRNA of barley (Hordeum
spontaneum)], U31004 through U31015 and U31031 [from rDNA encoding
5S rRNA of barley (Hordeum bulbosum)], Z11759 [from rDNA encoding 5.8S
rRNA of barley (Hordeum vulgare)], X16077 (from rDNA encoding 18S rRNA
of Arabidopsis thaliana), M65137 (rDNA encoding 5S rRNA of Arabidopsis
thaliana), AJ232900 (from rDNA encoding 5.8S rRNA of Arabidopsis
thaliana) and X52320 (from Arabidopsis thaliana genes for 5.8S and 25S
rRNA with an 18S rRNA fragment).
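
By way of illustration only, accession numbers such as those listed above may be checked for well-formedness before the corresponding sequences are retrieved. The following minimal Python sketch assumes the classic GenBank nucleotide accession formats (one letter followed by five digits, or two letters followed by six or eight digits). The function names are hypothetical and are not part of any database interface, and a well-formed accession is not thereby shown to exist or to contain the described sequence.

```python
import re

# Assumed formats: 1 letter + 5 digits, 2 letters + 6 digits, 2 letters + 8 digits.
ACCESSION_RE = re.compile(r"^(?:[A-Z]\d{5}|[A-Z]{2}\d{6}|[A-Z]{2}\d{8})$")


def normalize_accession(raw):
    """Remove internal whitespace and upper-case a raw accession string."""
    return re.sub(r"\s+", "", raw).upper()


def is_well_formed(accession):
    """Return True if the accession matches one of the assumed formats."""
    return ACCESSION_RE.match(accession) is not None


def check_accessions(raw_list):
    """Split raw accession strings into (well_formed, suspect) lists."""
    well_formed, suspect = [], []
    for raw in raw_list:
        acc = normalize_accession(raw)
        if is_well_formed(acc):
            well_formed.append(acc)
        else:
            suspect.append(acc)
    return well_formed, suspect
```

An accession that lands in the suspect list (for example, one lacking a letter prefix) may merit checking against the database record before use.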

Intergenic spacer regions of plant rDNA include, but are not limited to,
sequences contained in GENBANK Accession numbers S70723 (from the 5S
rDNA of barley (Hordeum vulgare)), AF013103 and X03989 (from maize
(Zea mays)), X65489 (from potato (Solanum tuberosum)), X52265 (from
tomato (Lycopersicon esculentum)), AF177418 (from Arabidopsis neglecta),
AF177421 and AF17422 (from Arabidopsis halleri), A71562, X155BO,
X52631, U43224, X52320, X52636 and X52637 (from Arabidopsis
thaliana; see Gruendler et al. (1991) J. Mol. Biol. 221:1209-1222 and
Gruendler et al. (1989) Nucleic Acids Res. 17:6395-6396), X54194 [from
rice (Oryza sativa)], Y08422 and D76443 [from tobacco (Nicotiana
tabacum)], AJ243073 [from wheat (Triticum boeoticum)] and X07841 [from
wheat (Triticum aestivum)]. Sequences of intergenic spacer regions of plant
rDNA further include sequences from rye [see Appels et al. (1986) Can. J.
Genet. Cytol. 25:673-685], wheat [see Barker et al. (1988) J. Mol. Biol.
201:1-17 and Sardana and Flavell (1996) Genome 33:288-292], radish [see
Delcasso-Tremousaygue et al. (1988) Eur. J. Biochem. 172:767-776], Vicia
faba and Pisum sativum [see Kato et al. (1990) Plant Mol. Biol. 14:983-993],
mung bean [see Gerstner et al. (1988) Genome 50:723-733; and Schiebel et
al. (1989) Mol. Gen. Genet. 218:302-307], tomato [see Schmidt-Puchta et
al. (1989) Plant Mol. Biol. 13:251-253], Hordeum bulbosum [see Procunier et
al. (1990) Plant Mol. Biol. 15:661-663], Lens culinaris Medik., and other
legume species [see Fernandez et al. (2000) Genome 43:597-603] and
tobacco [see U.S. Patent Nos. 6,100,092 and 6,096,546 and PCT
Application Publication No. WO99/66058; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; Borysyuk et al. (2000) Nature Biotechnology 18:1303-1306].

Mammalian rDNA sequences include, but are not limited to, DNA of
GENBANK accession no. X82564 and portions thereof, the DNA of
GENBANK accession no. U13369 and portions thereof and DNA sequences
provided in PCT Application Publication No. WO97/40183 (particularly SEQ.
ID. NOS. 18-24 of WO97/40183). A particular vector for use in directing
integration of heterologous nucleic acid into chromosomal rDNA is pTERPUD
(see PCT Application Publication No. WO97/40183). Satellite DNA
sequences can also be used to direct the heterologous DNA to integrate into
the pericentric heterochromatin. For example, vectors pTEMPUD and
pHASPUD, which contain mouse and human satellite DNA, respectively (see
PCT Application Publication No. WO97/40183), are examples of vectors that
may be used for introduction of heterologous nucleic acid into cells for de
novo chromosome formation leading to artificial chromosomes.

3. Methods for introduction of heterologous nucleic acids into host 
cells 

Any methods known in the art for introducing heterologous nucleic
acids into host cells may be used in the methods of preparing artificial
chromosomes. The particular method used may depend on the type of cell
into which the heterologous nucleic acid is being transferred. For example,
methods for the physical introduction of nucleic acids into plant cells, for
example, protoplasts and plant cells in culture, include, but are not limited to,
polyethylene glycol (PEG)-mediated DNA uptake, electroporation,
lipid-mediated delivery, including liposomes, calcium phosphate-mediated DNA
uptake, microinjection, particle bombardment, silicon carbide
whisker-mediated transformation and combinations of these methods, for example
methods utilizing combinations of calcium phosphate and PEG for DNA
uptake or methods utilizing a combination of electroporation, PEG and heat
shock (see, e.g., U.S. Patent Nos. 5,231,019 and 5,453,367). Physical
methods such as these are known in the art and are effective in introducing
DNA into a variety of dicotyledonous and monocotyledonous plants [see,
e.g., Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Bio/Technology 4:1001-1004;
Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J. and Vasil,
I.K., Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 5:941-948].

In addition to these methods for the introduction of nucleic acids into
plant cells based on physically, mechanically or chemically mediated
processes, it is possible to introduce nucleic acids into plant cells by
biological methods, such as those utilizing Agrobacterium. In this method,
nucleic acid sequences located adjacent to T-DNA border repeats can be
inserted into the genome of a plant cell, typically dicotyledonous plant cells,
by utilizing the encoded function for DNA transfer found in the genus
Agrobacterium. This method has also been shown to work for some
monocotyledonous plant cells, such as rice cells.

Any method for introducing nucleic acids into plant cells can be used
in the generation of artificial chromosomes, provided the method is capable
of introducing the nucleic acid into an amplifiable region of a chromosome,
for example, heterochromatin, and particularly in close proximity to a
megareplicator region of a plant chromosome.

a. Agrobacterium-mediated introduction of nucleic acids
into plant cells

Agrobacterium-mediated transformation is particularly well-suited for
transformation of dicotyledons because of its high efficiency of
transformation and its broad utility with many different species, including
tobacco, tomato (see, e.g., European Patent Application no. 0 249 432),
sunflower, cotton (see, e.g., European Patent Application no. 0 317 511),
oilseed rape, potato, soybean, alfalfa and poplar (see, e.g., U.S. Patent No.
4,795,855) (see also PCT Application Publication no. WO87/07299 with
respect to transformation of Brassica). Agrobacterium-mediated
transformation has also been used to transfer nucleic acids into
monocotyledonous plants. Agrobacterium-mediated transformation of
Chlorophytum capense and Narcissus cv "Paperwhite" [see, e.g., Hooykaas-
Van Slogteren et al. (1984) Nature 311:763-764], corn and wheat [see, e.g.,
U.S. Patent Nos. 5,164,310, 5,187,073 and 5,177,010 and Mooney et al.
(1991) Plant Cell, Tissue, Organ Culture 25:209-218], rice [see, e.g., Raineri
et al. (1990) Bio/Technology 8:33-38 and Chan et al. (1993) Plant Mol. Biol.
22:491-506] and barley [see, e.g., Tingay et al. (1997) The Plant J.
11:1369-1376 and Qureshi et al. (1998) Proc. 42nd Conference of
Australian Society for Biochemistry and Molecular Biology, September 28-
October 1, 1998, Adelaide, Australia] has been reported.

Agrobacterium-mediated delivery of nucleic acids is based on the
capacity of certain Agrobacterium strains to introduce a part of their Ti
(tumor-inducing) plasmid, i.e., the transforming DNA or T-DNA, into plant
cells and to integrate this T-DNA into the genome of the cells. The part of
the Ti plasmid that is transferred and integrated is delineated by specific DNA
sequences, the left and right T-DNA border sequences. The natural T-DNA
sequences between these border sequences can be replaced by foreign DNA
[see, e.g., European Patent Publication 116 718 and Deblaere et al. (1987)
Meth. Enzymol. 153:277-293].

When Agrobacterium is used for transformation, the heterologous
nucleic acid being transferred typically is cloned into a plasmid that contains
T-DNA border regions and is replicated independently of the Ti plasmid
(referred to as the binary vector system) or the heterologous nucleic acid is
inserted between the T-DNA borders of the Ti plasmid (referred to as the
co-integrate method). In co-integrate methods, intermediate vectors carrying
the heterologous nucleic acid are integrated
into the Ti or Ri plasmid by homologous recombination owing to sequences
that are homologous to sequences within the T-DNA region of the Ti or Ri
plasmid. The Ti or Ri plasmid also contains the vir region necessary for
transfer of the T-DNA.

Intermediate vectors cannot replicate in Agrobacterium. The
intermediate vector can be transferred into Agrobacterium by means of a
helper plasmid [conjugation, see Fraley et al. (1983) Proc. Natl. Acad. Sci.
USA 80:4803]. This method, typically referred to as triparental mating,
introduces the heterologous nucleic acid sequence into the bacterium and
allows for selection of a homologous recombination event that produces the
desired Agrobacterium genotype. The triparental mating procedure typically
employs Escherichia coli carrying the recombinant intermediate vector and a
helper E. coli strain which carries a plasmid that is able to mobilize the
recombinant intermediate vector to the target Agrobacterium strain. A
modified Ti or Ri plasmid is obtained from the transfer and selection process,
which contains a heterologous nucleic acid sequence located within the
T-DNA region. The resultant Agrobacterium strain is capable of transferring
the heterologous nucleic acid to plant cells.

Binary vectors can replicate both in E. coli and Agrobacterium. They
typically contain a selection marker gene and a linker or polylinker which are
flanked by the right and left T-DNA border regions and can be transformed
directly into Agrobacterium [see, e.g., Hofgen and Willmitzer (1988) Nuc.
Acids. Res. 16:9877 and Holsters et al. (1978) Mol. Gen. Genet. 163:181-187]
or introduced through triparental mating. The Agrobacterium host cell
contains a plasmid carrying a vir region needed for transfer of the T-DNA into
a plant cell [see, e.g., White in Plant Biotechnology, eds. Kung, S. and
Arntzen, C.J., Butterworth Publishers, Boston, Mass., (1989) p. 3-34 and
Fraley in Plant Biotechnology, eds. Kung, S. and Arntzen, C.J., Butterworth
Publishers, Boston, Mass., (1989) p. 395-407].

Agrobacterium-mediated transformation typically involves the transfer
of a binary vector carrying the heterologous nucleic acid of interest to an
appropriate Agrobacterium strain, the choice of which may depend on the
complement of vir genes carried by the host Agrobacterium strain either on a
co-resident Ti plasmid or chromosomally [see, e.g., Uknes et al. (1993) Plant
Cell 5:159-169]. The transfer of a recombinant binary vector to Agrobacterium
can be accomplished by a triparental mating procedure using Escherichia coli
carrying the recombinant binary vector and a helper E. coli strain which
carries a plasmid that is able to mobilize the recombinant binary vector to the
target Agrobacterium strain. Alternatively, the recombinant binary vector can be
transferred to Agrobacterium by DNA transformation [see, e.g., Hofgen &
Willmitzer (1988) Nuc. Acids. Res. 16:9877].

Many vectors are available for transfer of nucleic acids into
Agrobacterium tumefaciens [see, e.g., Rogers et al. (1987) Methods in
Enzymol. 153:253-277]. These typically carry at least one T-DNA border
sequence and include vectors such as pBIN19 [see, e.g., Bevan (1984) Nuc.
Acids. Res. 12:8711-8721]. Typical vectors suitable for Agrobacterium
transformation include the binary vectors pCIB200 and pCIB2001, as well as
the binary vector pCIB10 and hygromycin selection derivatives thereof (see,
e.g., U.S. Patent No. 5,639,949). Other vectors that can be employed are
the pCambia vectors (see www.cambia.org), including, for example,
pCambia 3300 and pCambia 1302 (GenBank Accession No. AF234298).

A particularly useful Ti plasmid cassette vector for the transformation
of dicotyledonous plants contains the enhanced CaMV35S promoter (EN35S)
and the 3' end, including polyadenylation signals, of a soybean gene
encoding the α subunit of β-conglycinin. Between these two elements is a
multilinker containing multiple restriction sites for the insertion of genes of
interest (see, e.g., U.S. Patent No. 6,023,013). The vector can contain a
segment of pBR322 which provides an origin of replication in E. coli and a
region for homologous recombination with the disarmed T-DNA in
Agrobacterium strain ACO; the oriV region from the broad host range
plasmid RK1; the streptomycin/spectinomycin resistance gene from Tn7; and
a chimeric NPTII gene, containing the CaMV35S promoter and the nopaline
synthase (NOS) 3' end, which provides kanamycin resistance in transformed
plant cells. Optionally, the enhanced CaMV35S promoter may be replaced
with the 1.5 kb mannopine synthase (MAS) promoter [see, e.g., Velten et al.
(1984) EMBO J. 3:2723-2730]. After incorporation of a DNA construct into
the vector, it is introduced into A. tumefaciens strain ACO, which contains a
disarmed Ti plasmid. Cointegrate Ti plasmid vectors are selected and
subsequently may be used to transform a dicotyledonous plant.

Transformation of the target plant species by recombinant 
Agrobacterium usually involves co-cultivation of the Agrobacterium with 
explants from the plant and follows published protocols. Methods of 
inoculation of the plant tissue vary depending upon the plant species and the 
Agrobacterium delivery system. The plant tissue can be either protoplast, 
callus or organ tissue, depending on the plant species. A widely used 
approach is the leaf disc procedure which can be performed with any tissue 
explant that provides a good source for initiation of whole plant 
differentiation (see, e.g., Horsch et al. in Plant Molecular Biology Manual A3,
Kluwer Academic Publishers, Dordrecht (1988) p. 1-9 and U.S. Patent No. 
6,136,320). The addition of nurse tissue may be desirable under certain 
conditions. There are multiple choices of Agrobacterium strains (including, 
but not limited to, A. tumefaciens and A. rhizogenes) and plasmid 
construction strategies that can be used to optimize genetic transformation 
of plants. Transformed tissue carrying an antibiotic or herbicide resistance
marker present between the T-DNA borders of the binary plasmid can be
regenerated on selectable medium.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE [see
Fraley et al. (1985) Bio/Technology 3:629-635]. For construction of ACO,
the starting Agrobacterium strain was A208, which contains a nopaline-type
Ti plasmid. The Ti plasmid was disarmed in a manner similar to that
described by Fraley et al. [(1985) Bio/Technology 3:629-635] so that
essentially all of the native T-DNA was removed except for the left border
and a few hundred base pairs of T-DNA inside the left border. The remainder
of the T-DNA extending to a point just beyond the right border was replaced
with a piece of DNA including (from left to right) a segment of pBR322, the
oriV region from plasmid RK2, and the kanamycin resistance gene from
Tn601. The pBR322 and oriV segments are similar to the corresponding
segments of the cassette vector described above and
provide a region of homology for cointegrate formation (see U.S. Patent No.
6,023,013). Another useful strain of Agrobacterium is A. tumefaciens strain
GV3101/pMP90 [see, e.g., Koncz and Schell (1986) Mol. Gen. Genet.
204:383-396].

Advances in Agrobacterium-mediated transfer allow introduction of
larger segments of nucleic acids [see, e.g., Hamilton (1997) Gene
200(1-2):107-116; Hamilton et al. (1996) Proc. Natl. Acad. Sci. U.S.A.
93:9975-9979; Liu et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96:6535-6540]. The
vectors used in these methods are designed to have the characteristics of
both bacterial artificial chromosomes (BACs) and binary vectors for
Agrobacterium-mediated transformation. Therefore, somewhat larger DNA
fragments cloned in the T-DNA region can be transferred into a plant genome
by Agrobacterium. Binary bacterial artificial chromosome (BIBAC) vector
BIBAC2 (see U.S. Patent No. 5,733,744; available from the Plant Science
Center, Cornell University) and the transformation-competent bacterial
artificial chromosome (TAC) vector pYLTAC7 (available from the Plant Cell
Bank of the RIKEN Gene Bank, Tsukuba, Japan) are examples of the types of
vectors that may be used in transferring larger segments of nucleic acids,
particularly heterologous nucleic acids containing targeting and/or selectable
marker sequences as described herein, into plants via
Agrobacterium-mediated DNA transfer processes.

Introduction of heterologous nucleic acids into plant cells without the
use of Agrobacterium circumvents the requirements for T-DNA sequences in
the transformation vector and consequently vectors lacking these sequences
can be utilized in addition to vectors containing T-DNA sequences.
Techniques for nucleic acid transfer that do not rely on Agrobacterium
include transformation via particle bombardment, direct DNA uptake (e.g.,
PEG, lipids, electroporation) and mechanical methods such as microinjection
or silicon "whiskers". The choice of vector that may be used in introduction
of heterologous nucleic acids into plant cells can depend largely on the
preferred selection for the species being transformed. Typical vectors
suitable for transformation without Agrobacterium include pCIB3064,
pSOG19 and pSOG35 (see, e.g., U.S. Patent No. 5,639,949), or common
plasmid, phage or cosmid vectors.

b. Direct DNA Uptake 
Introduction of heterologous nucleic acids into plant cells may be
achieved using a variety of methods that facilitate direct DNA uptake,
including calcium phosphate precipitation, polyethylene glycol (PEG)
treatment, electroporation, and combinations thereof [see, e.g., Potrykus et
al. (1985) Mol. Gen. Genet. 199:133; Lorz et al. (1985) Mol. Gen. Genet.
199:173; Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828;
Uchimiya et al. (1986) Mol. Gen. Genet. 204:204; Callis et al. (1987) Genes
Dev. 1:1183-1200; Callis et al. (1987) Nuc. Acids Res. 15:5823-5831;
Marcotte et al. (1988) Nature 335:454; Toriyama et al. (1988)
Bio/Technology 6:1072-1074; Haim et al. (1985) Mol. Gen. Genet. 199:161-168;
Deshayes et al. (1985) EMBO J. 4:2731-2737; Krens et al. (1982)
Nature 296:72-74; Crossway et al. (1986) Mol. Gen. Genet. 20:179].

Typically, plant protoplasts are used for direct DNA uptake, or in some
instances plant tissue that has been treated to remove a portion or the
majority of the cell wall (see, e.g., PCT Publication No. WO93/21335 and
U.S. Patent No. 5,472,869). Removal of the cell wall is believed to facilitate
entry of DNA into plant cells, although in some instances electroporation may
be used to introduce DNA into specialized plant cells, e.g., electroporation of
pollen, without first removing the cell wall.

Techniques for the preparation of callus and protoplasts from maize,
transformation of protoplasts using PEG or electroporation, and the
regeneration of maize plants from transformed protoplasts are found, for
example, in European Patent Application nos. 0 292 435 and 0 392 225 and
PCT Application Publication no. WO93/07278. Transformation of rice can
also be undertaken by direct gene transfer techniques utilizing protoplasts
[see, e.g., Zhang et al. (1988) Plant Cell Rep. 7:379-384; Shimamoto et al.
(1989) Nature 338:274-277; Datta et al. (1990) Bio/Technology 8:736-740].
The regeneration of fertile transgenic barley by direct DNA transfer to
protoplasts is described, for example, by Funatsuki et al. [(1995) Theor.
Appl. Genet. 57:707-712]. Other plant species, including tobacco and
Arabidopsis, may also serve as sources of protoplasts for use in introduction
of heterologous nucleic acids into plant cells.

c. Particle bombardment-mediated introduction of nucleic
acids into plant cells

Microprojectile bombardment of plant cells can be an effective method
for the introduction of nucleic acids into plant cells. In these methods,
nucleic acids are carried through the cell wall and into the cytoplasm on the
surface of small, typically metal, particles [see, e.g., Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A. 85:8502-8505;
Klein et al. in Progress in Plant Cellular and Molecular Biology, eds.
Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J., Kluwer
Academic Publishers, Dordrecht, (1988), p. 56-66; Seki et al. (1999) Mol.
Biotechnol. 11:251-255; and McCabe et al. (1988) Bio/Technology 6:923-926].
Particles may be coated with nucleic acids and delivered into cells by
a propelling force. Exemplary particles include those containing tungsten,
gold or platinum, as well as magnesium sulfate crystals. The metal
particles can penetrate through several layers of cells and thus allow the
transformation of cells within tissue explants.

In an illustrative embodiment [see, e.g., U.S. Patent No. 6,023,013] of
a method for delivering nucleic acids into plant cells, e.g., maize cells, by
acceleration, a Biolistics Particle Delivery System may be used to propel
particles coated with DNA or cells through a screen, such as a stainless steel
or Nytex screen, onto a filter surface covered with plant (e.g., corn) cells
cultured in suspension. The screen disperses the particles so that they are
not delivered to the recipient cells in large aggregates. The intervening
screen between the projectile apparatus and the cells to be bombarded may
reduce the size of projectile aggregates and may contribute to a higher
frequency of transformation by reducing damage inflicted on the recipient
cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
macroprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
can be important in this technology. Physical factors include those that
involve manipulating the DNA/microprojectile precipitate or those that affect
the flight and velocity of either the macro- or microprojectiles. Biological
factors include all steps involved in manipulation of cells before and
immediately after bombardment, the osmotic adjustment of target cells to
help alleviate the trauma associated with bombardment, and also the nature
of the transforming nucleic acid, such as linearized DNA or intact supercoiled
plasmids.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells.
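
By way of illustration only, a set of bombardment conditions to be compared empirically may be enumerated as a full factorial grid over the adjustable physical parameters named above. The following minimal Python sketch makes no recommendation as to settings; the function name, parameter names, units and numerical values are hypothetical placeholders chosen for the example.

```python
from itertools import product


def bombardment_grid(parameters):
    """Enumerate every combination of the supplied parameter levels.

    parameters maps a parameter name to a list of levels to test.
    Returns a list of dicts, one per experimental condition.
    """
    names = list(parameters)
    levels = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*levels)]


# Hypothetical placeholder levels, for illustration only.
example_parameters = {
    "gap_distance_mm": [6, 9],
    "flight_distance_mm": [10],
    "tissue_distance_cm": [6, 9],
    "helium_pressure_psi": [900, 1100, 1350],
}

conditions = bombardment_grid(example_parameters)
```

Each resulting condition may then be scored, for example by the number of stable transformants recovered, to identify settings suited to a particular tissue.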

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. [(1990) Plant Cell
2:603-618] and Fromm et al. [(1990) Bio/Technology 8:833-839].
Transformation of rice may also be accomplished via particle bombardment
[see, e.g., Christou et al. (1991) Bio/Technology 9:957-962]. Particle
bombardment may also be used to transform wheat [see, e.g., Vasil et al.
(1992) Bio/Technology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol. 102:1017-1084
for transformation of wheat using particle bombardment of immature
embryos and immature embryo-derived callus]. The production of transgenic
barley using bombardment methods is described, for example, by Koprek et
al. [(1996) Plant Sci. 119:79-91].

d. Electroporation-mediated introduction of nucleic acids
into plant cells

The application of brief, high-voltage electric pulses to a variety of
animal and plant cells leads to the formation of nanometer-sized pores in the
plasma membrane. Nucleic acids are taken directly into the cell cytoplasm
either through these pores or as a consequence of the redistribution of
membrane components that accompanies closure of the pores.
Electroporation can be extremely efficient and can be used both for transient
expression of cloned genes and for the establishment of cell lines that carry
integrated copies of the gene of interest.

Certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, may be employed to render the target recipient cells more
susceptible to transformation by electroporation than untreated cells.
Alternatively, recipient cells may be made more susceptible to transformation by
mechanical wounding. To effect transformation by electroporation, friable
tissues such as a suspension culture of cells or embryonic callus may be
used or immature embryos or other organized tissues may be directly
transformed [see, e.g., Fromm et al. (1986) Nature 319:791-793; and
Neuman et al. (1982) EMBO J. 1:841-845].

e. Microinjection-mediated introduction of nucleic acids into
plant cells

In microinjection techniques, nucleic acids are mechanically injected
directly into cells using very small micropipettes. For example, microinjection
of protoplast cells with foreign DNA for transformation of plant cells has
been reported for barley and tobacco [see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 7:23-30].

f. Lipid-mediated introduction of nucleic acids into plant
cells

In lipid-mediated transfer, nucleic acids are contacted with lipids
and/or encapsulated in lipid-containing structures, including but not limited to
liposomes, and the nucleic acid-containing liposomes are fused with plant
protoplasts. The fusion can occur in the presence or absence of a fusogen,
such as PEG. Lipid-mediated transformation of plant protoplasts has been
reported [see, e.g., Fraley and Papahadjopoulos (1982) Curr. Top. Microbiol.
Immunol. 96:171-191; Deshayes et al. (1985) EMBO J. 4:2731-2737 and
Spoerlein and Koop (1991) Theor. Appl. Genetics 53:1-5].

g. Other methods of introduction of nucleic acids into plant 
cells 

Other methods to physically introduce nucleic acid into plant cells may
be used, including silicon carbide fibers ("whiskers") that are used to pierce
plant cell walls thereby facilitating nucleic acid uptake, the use of sound
waves to introduce holes in plant cell membranes to facilitate nucleic acid
uptake (e.g., sonoporation) and the use of laser beams to open holes in cell
membranes facilitating the entry of nucleic acids (e.g., laser poration).

Nucleic acids may also be imbibed by hydrating plant tissue, providing
another method for nucleic acid uptake into plant cells [see, e.g., Simon
(1974) New Phytologist 37:377-420]. For example, nucleic acids may be
taken into cereal and legume seed embryos by imbibition [see, e.g., Toepfer
et al. (1989) The Plant Cell 1:133-139].

4. Treatment of cells into which heterologous nucleic acids have 
been introduced 

Cells into which heterologous nucleic acids have been introduced may
be analyzed for de novo formation of artificial chromosomes described herein,
such as may result from amplification of chromosomal segments occurring in
connection with integration of heterologous nucleic acids into chromosomes.
Typically, amplification occurs over multiple generations of cell division,
leading to the formation of detectable changes in chromosome structure.
Therefore, transfected cells are typically cultured through multiple cell
divisions, from about 5 to about 60, or about 5 to about 55, or about 10 to
about 55, or about 25 to about 55, or about 35 to about 55 cell divisions
following introduction of nucleic acid into a cell. Artificial chromosomes
may, however, appear after only about 5 to about 15 or about 10 to about
15 cell divisions. Cells into which heterologous nucleic acids have been
introduced may be treated in a variety of ways prior to or during analysis
thereof for the presence of artificial chromosomes.

For example, cells into which nucleic acid encoding a selectable
marker required for growth in the presence of a selection agent has been
transferred can be treated as the exemplified cells herein to facilitate
generation of multicentric chromosomes, and fragmentation thereof, and/or
the generation of artificial chromosomes. The cells may be grown in the
presence of an appropriate concentration of selection agent, which may be
determined empirically by growing untransfected cells in varying
concentrations of the agent and identifying concentrations sufficient to
prevent cell growth and/or facilitate amplification of chromosomal segments.
Transfected cells may be grown in selective media for numerous generations
and cell lines can be established that contain the introduced nucleic acid.
The concentration of selection agent may also be increased over several
generations to promote amplification of a region of a chromosome into which
heterologous nucleic acid integrated. Transfected cells may also be treated
to destabilize the chromosomes to facilitate generation and fragmentation of
a multicentric, typically dicentric, chromosome.

Additional heterologous nucleic acid, e.g., nucleic acid encoding a
selectable marker, may also be introduced into the transfected cells to
facilitate amplification of chromosomal segments, such as the pericentric
heterochromatin, contained in, for example, a fragment released from a
multicentric chromosome (e.g., a formerly dicentric chromosome), and
generation of a heterochromatic artificial chromosome. The resulting
transformed cells can then be grown in the presence of a selection agent,
which may be a second agent (if the heterologous nucleic acid introduced
into the transfected cells encodes a selectable marker different from any
selectable marker encoded by heterologous nucleic acid initially transferred
into the original host cells), with or without the first selection agent.
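
The empirical determination of a selection-agent concentration described
above can be illustrated with a short calculation. The following Python sketch
assumes that growth of untransfected cells has been measured at several agent
concentrations; the function names, the example measurements and the 5%
growth cutoff are hypothetical and are not taken from this description.

```python
# Hypothetical kill-curve helper. The data layout, the example values and
# the 5% relative-growth cutoff are illustrative assumptions only.


def minimal_inhibitory_concentration(growth_by_conc, max_relative_growth=0.05):
    """Return the lowest tested concentration that prevents growth.

    growth_by_conc maps agent concentration (any consistent unit) to a
    growth measurement of untransfected cells (e.g., cell count); it must
    contain the no-agent control under the key 0.0. "Prevents growth" is
    taken to mean growth no greater than max_relative_growth of the control.
    Returns None if no tested concentration is sufficient.
    """
    control = growth_by_conc[0.0]
    if control <= 0:
        raise ValueError("control culture shows no growth")
    for conc in sorted(c for c in growth_by_conc if c > 0):
        if growth_by_conc[conc] / control <= max_relative_growth:
            return conc
    return None


def stepwise_schedule(start_conc, fold_increase, steps):
    """Concentrations for increasing the agent over successive passages."""
    return [start_conc * fold_increase ** i for i in range(steps)]


if __name__ == "__main__":
    # Made-up measurements for untransfected cells.
    growth = {0.0: 100.0, 10.0: 62.0, 25.0: 18.0, 50.0: 3.0, 100.0: 0.5}
    lowest = minimal_inhibitory_concentration(growth)
    print("lowest inhibitory concentration tested:", lowest)
    print("escalation schedule:", stepwise_schedule(lowest, 2, 3))
```

The second function simply lists concentrations for a stepwise increase of the
selection agent over several generations; the fold increase per step is an
arbitrary choice that would itself be determined empirically.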

Cells into which nucleic acids have been introduced may also be
subjected to cell sorting. For example, protoplasts may be prepared from
transfected plant cells or calli and subjected to sorting. If the sorting is
conducted prior to chromosomal analysis of the cells for the presence of
artificial chromosomes, it provides a population of transfected cells that may
be enriched for artificial chromosomes and thus facilitates the subsequent
chromosomal analysis of the cells.

The sorting is based on the presence of a detectable marker in the
cells, as provided for by the introduced nucleic acid, which can provide the
basis for isolating such cells from cells that do not contain the heterologous
nucleic acid. For example, the nucleic acid introduced into the plant cells
may contain nucleic acid encoding a fluorescent protein, such as a green, red
or blue fluorescent protein, which may be used for selection, by flow
cytometry and other methods, of recipient cells that have taken up and
express the nucleic acid at readily detected levels.

In an exemplary protocol, GFP fluorescence of transfected cell cultures
may be monitored visually during culture using an inverted microscope
equipped with epifluorescence illumination (Axiovert 25; Zeiss, North York,
ON) and a #41017 Endow GFP filter set (Chroma Technologies, Brattleboro,
VT). Enrichment of GFP-expressing populations can be carried out as
follows. Cell sorting may be carried out, for example, using a FACS Vantage
flow cytometer (Becton Dickinson Immunocytometry Systems, San Jose,
CA) equipped with turbo-sort option and 2 Innova 306 lasers (Coherent, Palo
Alto, CA). For cell sorting, a 70 µm nozzle can be used. The buffer can be
changed to PBS (maintained at 20 p.s.i.). GFP may be excited with a 488
nm laser beam and emission detected in FL1 using a 500 EFLP filter.
Forward and side scattering can be adjusted to select for viable cells. Gating
parameters may be adjusted using untransfected cells as negative control
and GFP CHO cells as positive control.

For the first round of sorting, transfected cells may be harvested
post-transfection (e.g., about 7-14 days post-transfection), converted to
protoplasts, resuspended in about 10 ml of growth medium and sorted for
GFP-expressing populations using parameters described above. GFP-positive
cells may be dispensed into a volume of about 5-10 ml of protoplast medium
while non-expressing cells are directed to waste. The expressing cells may
be cultured. Plant cells or calli can then be analyzed by fluorescence in situ
hybridization screening.
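
The gating logic of the exemplary sorting protocol above can be sketched as
follows. This Python illustration assumes each event is recorded as a
(forward scatter, side scatter, FL1) tuple and sets the GFP gate from the
untransfected negative control; the 99.9th-percentile rule, the scatter
windows and all function names are hypothetical choices, not parameters
stated in this description.

```python
import math

# Illustrative gating sketch. Event layout, percentile rule and scatter
# windows are assumptions made for the example only.


def percentile(values, fraction):
    """Nearest-rank percentile of a non-empty list (fraction in 0..1)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def gfp_gate_threshold(negative_control_fl1, fraction=0.999):
    """FL1 value below which nearly all untransfected events fall."""
    return percentile(negative_control_fl1, fraction)


def sort_events(events, fl1_threshold, fsc_range, ssc_range):
    """Split (fsc, ssc, fl1) events into (collected, waste).

    An event is collected only if it lies inside the scatter window used
    to select viable cells and its FL1 signal exceeds the GFP gate.
    """
    collected, waste = [], []
    for fsc, ssc, fl1 in events:
        in_scatter_window = (fsc_range[0] <= fsc <= fsc_range[1]
                             and ssc_range[0] <= ssc <= ssc_range[1])
        if in_scatter_window and fl1 > fl1_threshold:
            collected.append((fsc, ssc, fl1))
        else:
            waste.append((fsc, ssc, fl1))
    return collected, waste


if __name__ == "__main__":
    negative_fl1 = [10, 12, 11, 15, 9]          # made-up control values
    threshold = gfp_gate_threshold(negative_fl1)
    events = [(500, 300, 200), (500, 300, 14), (50, 300, 400)]
    kept, discarded = sort_events(events, threshold, (100, 1000), (100, 1000))
    print("gate:", threshold, "collected:", kept, "waste:", discarded)
```

A positive control population would be used in the same way to confirm that
genuinely expressing cells fall above the chosen gate.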

5. Analysis of transformed cells and identification and 
manipulation of artificial chromosomes 

Cells into which nucleic acids have been introduced, and which may
or may not have been further treated as described herein, may be analyzed
for indications of amplification of chromosomal segments, the presence of
structures that may arise in connection with amplification and de novo
artificial chromosome formation and/or the presence of desired artificial
chromosomes as described herein. Analysis of the cells typically involves
methods of visualizing chromosome structure, including, but not limited to,
G- and C-banding, PCR, Southern blotting and FISH analyses, using techniques
described herein and/or known to those of skill in the art. Such analyses can
employ specific labelling of particular nucleic acids, such as satellite DNA
sequences, heterochromatin, rDNA sequences and heterologous nucleic acid
sequences, that may be subject to amplification. During analysis of
transfected cells, a change in chromosome number and/or the appearance of
chromosomal structures that are distinctive, for example, by increased
segmentation arising from amplification of repeat units, will also assist in
identification of cells containing artificial chromosomes. The following
description of events and structures that may be observed in analyzing cells
for evidence of chromosomal amplification and/or the presence of artificial
chromosomes is intended to be illustrative of the observations and
considerations that may occur in the analysis of cells of any type, including
mammalian and plant cells. It should be recognized that numerous types of
structures may be formed during amplification of chromosomal segments and
treatment of the cells. Additional, yet related, structures and variations of
these structures are contemplated herein and are recognizable based on the
descriptions and teachings of the generation and identification of artificial
chromosomes presented herein. Each structure can be further manipulated, for
example using procedures described herein, to derive additional chromosomal
structures and compositions.

Typically, de novo centromere formation occurs in cells upon
integration of heterologous nucleic acids into the cell chromosomes and
amplification of chromosomal and heterologous nucleic acids. The
integration and amplification that give rise to de novo centromere formation
typically occur at the centromeric region of the short arm of a chromosome,
typically an acrocentric chromosome. By employing methods such as
chromosome-staining methods, including FISH and G- and C-banding, it may
be possible to identify a chromosome at which the process occurs.

The amplification can lead to the formation of multicentric, typically
dicentric, chromosomes. Because of the presence of two or more
functionally active centromeres on the same chromosome, regular breakages
occur between the centromeres. Such specific chromosome breakages can
give rise to the appearance of a chromosome fragment carrying a
neo-centromere. The neo-centromere may be found on a minichromosome
(neo-minichromosome), while a formerly dicentric chromosome may carry traces
of the heterologous nucleic acid.

a. The neo-minichromosome 

Breakage of a dicentric chromosome between the two functional
centromeres can form at least two chromosomes, for example, a so-called
minichromosome, and a formerly dicentric chromosome. Treatment of cells
containing a dicentric chromosome, such as, for example, recloning,
treatment with agents that destabilize the chromosomes, e.g., BrdU, and/or
culturing under selective conditions, may facilitate breakage of the dicentric
chromosome. Selection of transformed cells can yield cell lines containing a
stable neo-minichromosome. The breakage of a multicentric, typically
dicentric, chromosome in transformed cells, which separates the
neo-centromere from the remainder of the endogenous chromosome, may occur,
for example, in the G-band positive heterologous nucleic acid region as is
suggested if traces of the heterologous nucleic acid sequences at the broken
end of the formerly dicentric chromosome are observed.

Multiple E-type amplification (amplification of euchromatin) may form a
neo-chromosome, which separates from the remainder of the dicentric
chromosome through a specific breakage between the centromeres of the
dicentric chromosome. Inverted duplication of the fragment bearing the
neo-centromere can result in the formation of a stable neo-minichromosome.
The minichromosome is generally at least about 20-30 Mb in size.

The presence of inverted chromosome segments can be associated
with the chromosomes formed de novo at the centromeric region of a
chromosome. During the formation of the neo-minichromosome, the event
leading to the stabilization of the distal segment of the chromosome that
bears the duplicated neo-centromere may be the formation of its inverted
duplicate.

Although the neo-minichromosome typically carries only one functional
centromere, both ends of the minichromosome can be heterochromatic,
carrying, for example, satellite DNA sequences as discernable by in situ
hybridization. Comparison of the G-band pattern of a chromosome fragment
carrying the neo-centromere with that of a stable neo-minichromosome can
indicate that the neo-minichromosome is an inverted duplicate of the
chromosome fragment that bears the neo-centromere.

Cells containing a de novo-formed minichromosome, which contains
multiple repeats of the heterologous nucleic acids, can be used as recipient
cells in cell transfection. Donor nucleic acids, such as heterologous nucleic
acids containing DNA encoding a desired protein and DNA encoding a
second selectable marker, can be introduced into the cells and integrated into
the de novo-formed minichromosomes. To facilitate integration into the de
novo-formed minichromosomes, the heterologous DNA may also contain
sequences that are homologous to nucleic acids already present in the
minichromosomes, which can, through homologous recombination, provide
targeted integration into the minichromosome. Nucleic acids can also be
integrated into the minichromosome through the use of site-specific
recombinases by producing minichromosomes containing site-specific
recombination sites as described herein. Integration can be verified by in situ
hybridization and Southern blot analyses. Transcription and translation of
heterologous DNA can be confirmed by primer extension, immunoblot
analyses and reporter gene assays, if a reporter gene has been included in
the heterologous DNA, using, for example, appropriate nucleic acid probes
and/or product-specific antibodies.

The resulting engineered minichromosome that contains the heterologous
DNA can also be transferred, for example by cell fusion, into a recipient
cell line to further verify correct expression of the heterologous DNA.
Following production of the cells, metaphase chromosomes can be obtained,
such as by addition of colchicine, and the minichromosomes purified using
methods as described herein. The resulting minichromosomes can be used
for delivery to specific cells of interest using any known method or methods
for transferring heterologous nucleic acids into cells, particularly plant cells,
and/or methods described herein.

Thus, the neo-minichromosome is stably maintained in cells, replicates
autonomously, and permits the persistent, long-term expression of genes
under non-selective culture conditions, and in a whole, intact, regenerated
plant. It also can contain megabases of heterologous known DNA that can
serve as target sites for homologous recombination and integration of DNA
of interest. The neo-minichromosome is, thus, a vector for the delivery and
expression of nucleic acids to cells.

Cell lines that contain artificial chromosomes, such as the
minichromosome, the neo-chromosome, and the heterochromatic artificial
chromosomes, are a convenient source of these chromosomes and can be
manipulated, such as by cell fusion or production of microcells for fusion
with selected cell lines, to deliver the chromosome of interest into a
multiplicity of cell lines, including cells from a variety of different plant
species.



b. Heterochromatin-containing and predominantly
heterochromatic artificial chromosomes

Manipulation of cells containing a fragment released upon breakage of
the dicentric chromosome (e.g., a formerly dicentric chromosome), for
example, by introducing additional heterologous nucleic acids, including, for
example, DNA encoding a second selectable marker and growth under
selective conditions, can yield heterochromatic structures. Included among
such structures are compositions referred to as sausage chromosomes and
megachromosomes. For example, a formerly dicentric chromosome may
translocate to the end of another chromosome, such as an acrocentric
chromosome. Additional heterologous nucleic acids added to cells containing
a formerly dicentric chromosome can integrate into the pericentric
heterochromatin of the formerly dicentric chromosome and be amplified
several times with megabases of pericentric heterochromatic satellite DNA
sequences forming a "sausage" chromosome carrying a newly formed
heterochromatic chromosome arm. The size of this heterochromatic arm can
vary, for example, between ~150 and ~800 Mb in individual metaphases.
The chromosome arm can contain four to five satellite segments rich in
satellite DNA, and evenly spaced integrated heterologous "foreign" DNA
sequences. At the end of the compact heterochromatic arm of the sausage
chromosome, a less condensed euchromatic terminal segment may be
observed. By capturing a euchromatic terminal segment, this new
chromosome arm is stabilized in the form of the "sausage" chromosome. In
subclones of sausage chromosome-containing cell lines, the heterochromatic
arm of the sausage chromosome may become unstable and show continuous
intrachromosomal growth, particularly after treatment with BrdU and/or drug
selection to induce further H-type amplification. In extreme cases, the
amplified chromosome arm can exceed 500 Mb or even 1000 Mb in size
(gigachromosome). Thus, the gigachromosome is a structure in which a
heterochromatic arm has amplified but not broken off from a euchromatic
arm.

In situ hybridization with, for example, biotin-labeled subfragments of
the added heterologous nucleic acids may show a hybridization signal only in
the heterochromatic arm of the sausage chromosome, indicating that the
heterologous nucleic acid sequences are localized in the pericentric
heterochromatin.

Gene expression, however, may be possible in the heterochromatic
environment of a sausage chromosome. The level of heterologous gene
expression may be determined by Northern hybridization with a subfragment
of the selectable marker gene. Reporter genes included in heterologous
nucleic acids also provide a readily detectable product for use in evaluating
gene expression in a sausage or other heterochromatic or predominantly
heterochromatic chromosome. Southern hybridization of DNA isolated
from subclones of sausage chromosome-containing cells with subfragments
of reporter (and selectable marker) genes can show a close correlation
between the intensity of hybridization and the length of the sausage
chromosome.

Cell lines containing sausage chromosomes can be manipulated to
yield additional heterochromatic structures and artificial chromosomes,
including, for example, an artificial chromosome referred to as a
megachromosome. Such manipulation includes fusion of the cell line with
other cells and growth in the presence of one or more selection agents
and/or BrdU.

Cells with a structure, such as the sausage chromosome, can be
selected and fused with a second cell line, including other plant and
non-plant species [see, e.g., Dudits et al. (1976) Hereditas 52:121-123 for the
fusion of human cells with carrot protoplasts and Wiegand et al. (1987) J.
Cell. Sci. (Pt. 2):145-149 for laser-induced fusion of plant protoplasts with
mammalian cells] to eliminate other chromosomes that are not of interest.
Structures such as sausage chromosomes formed during this process may be
further manipulated, for example, by treating the cells with agents that
destabilize chromosomes, e.g., BrdU, so that the heterochromatic arm forms
a chromosome that is substantially heterochromatic (e.g., a
megachromosome). Structures such as the gigachromosome, in which the
heterochromatic arm has amplified but not broken off from the euchromatic
arm, may also be observed. Further manipulation, such as fusions and
growth in selective conditions and/or BrdU treatment or other such
treatment, can lead to fragmentation of the megachromosome to form
smaller chromosomes that have the amplicon as the basic repeating unit.

If a cell with a sausage chromosome is selected, it can be treated with
an agent, such as BrdU, that destabilizes the chromosome so that the
heterochromatic arm forms a chromosome that is substantially
heterochromatic (e.g., a megachromosome). Prior to treating the cell with
BrdU, it can be fused with another cell line carrying chromosomes of another
species, in order to eliminate chromosomes of the original host cell and
obtain a cell in which the only chromosome from the host cell is the sausage
chromosome. The resulting hybrid cells can be grown in the presence of
multiple selection agents to select for those that carry the sausage
chromosome. In situ hybridization with chromosome painting probes that
detect chromosomes of both the host cell species and the species of cell to
which the host cell was fused can provide an indication of the chromosomal
make up of the hybrid cells.

Cell lines containing a sausage chromosome can be treated with a
destabilizing agent, such as BrdU, followed by growth in selective medium
and retreatment with BrdU. The BrdU treatments appear to destabilize the
genome, resulting in a change in the sausage chromosome as well. A cell
population in which a further amplification has occurred will arise. In
addition to the heterochromatic arm (which may, for example, be ~100-150
Mb) of the sausage chromosome, an extra centromere and another (for
example, ~150-250 Mb) heterochromatic chromosome arm may be formed.
By the acquisition of another euchromatic terminal segment, a new
submetacentric chromosome (e.g., megachromosome) can form.

Megachromosomes may also be produced through regrowth and
establishment of sausage chromosome-containing cells in selective medium.
Repeated BrdU treatment can produce cell lines that have a dwarf
megachromosome (for example, about 150-200 Mb), a truncated
megachromosome (for example, about 90-120 Mb), or a
micro-megachromosome (for example, about 50-90 Mb). Cell lines containing
smaller truncated megachromosomes can be used to generate even smaller
megachromosomes, e.g., ~10-30 Mb in size. This may be accomplished,
for example, by breakage and fragmentation of a micro-megachromosome
through exposing the cells to X-ray irradiation, BrdU or telomere-directed in
vivo chromosome fragmentation.

Apart from the euchromatic terminal segments and the integrated
foreign nucleic acid, the whole megachromosome, as well as other related
types of predominantly heterochromatic artificial chromosomes, is
constitutive heterochromatin. This can be demonstrated by C-banding of the
megachromosome, which results in positive staining characteristic of
constitutive heterochromatin. It can contain tandem arrays of satellite DNA.
In a particular example, satellite DNA blocks are organized into a giant
palindrome (amplicon) carrying integrated exogenous nucleic acid sequences
at each end. It is of course understood that the specific organization and
size of each component can vary among species, and also with the chromosome
in which the amplification event initiates.

In general, a clear segmentation may be observed in one or more arms
of an amplification-based chromosome. For example, a megachromosome
may contain building units that are amplicons of, for example, ~30 Mb
containing satellite DNA with the integrated "foreign" DNA sequences at
both ends. The ~30 Mb amplicons may be composed of two ~15 Mb
inverted doublets of ~7.5 Mb satellite DNA blocks, which are separated
from each other by a narrow band of non-satellite sequences. The wider
non-satellite regions at the amplicon borders may contain integrated,
exogenous (heterologous) nucleic acid, while any narrow bands of
non-satellite DNA sequences within the amplicons may be integral parts of the
pericentric heterochromatin of the host chromosomes. The sizes of the
building units of a megachromosome or other amplification-based
chromosome may vary depending on the species of the host chromosome
from which the artificial chromosome was generated.
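
The size relationships among the example building units above can be checked
with simple arithmetic. The following Python sketch uses the approximate
example sizes given in the text (~7.5 Mb blocks, ~15 Mb doublets, ~30 Mb
amplicons); the function names are hypothetical, and the calculation ignores
the non-satellite bands and integrated heterologous sequences, which is a
simplifying assumption.

```python
# Back-of-the-envelope arithmetic for the example amplicon structure.
# Sizes are the approximate example values from the text; non-satellite
# bands and integrated foreign DNA are ignored (simplifying assumption).


def amplicon_size_mb(block_mb=7.5, blocks_per_doublet=2,
                     doublets_per_amplicon=2):
    """Approximate amplicon size built from satellite DNA blocks."""
    return block_mb * blocks_per_doublet * doublets_per_amplicon


def amplicons_in_arm(arm_mb, amplicon_mb=30.0):
    """Approximate number of amplicons that fit in an arm of arm_mb."""
    return arm_mb / amplicon_mb


if __name__ == "__main__":
    size = amplicon_size_mb()
    print("amplicon size (Mb):", size)
    for arm in (150.0, 250.0):
        print("arm of", arm, "Mb holds about",
              round(amplicons_in_arm(arm, size), 1), "amplicons")
```

Because the sizes of the building units may vary with the species of the host
chromosome, the default values would be replaced with sizes measured for the
chromosome in question.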

Further BrdU treatment can produce cells and/or calli that include cells
with a truncated megachromosome. The megachromosome can be further
fragmented in vivo using a chromosome fragmentation vector to ultimately
produce a chromosome that comprises a smaller stable replicable unit, for
example, about 15 Mb-60 Mb, containing one to four megareplicons.

Apart from the euchromatic terminal segments, the whole
megachromosome is heterochromatic, and has structural homogeneity.
Therefore, artificial chromosomes such as the megachromosome offer a
unique possibility for obtaining information about the amplification process,
and for analyzing some basic characteristics of the pericentric constitutive
heterochromatin, as a vector for heterologous DNA, and as a target for
further fragmentation.

C. Isolation of Artificial Chromosomes

The artificial chromosomes provided herein can be isolated by any
suitable method known to those of skill in the art. Also, methods are
provided herein for effecting substantial purification, particularly of the
artificial chromosomes.

Artificial chromosomes may be sorted from endogenous
chromosomes using any suitable procedures, which typically involve isolating
metaphase chromosomes, distinguishing the artificial chromosomes from the
endogenous chromosomes, and separating the artificial chromosomes from
endogenous chromosomes. Such procedures will generally include the
following basic steps for animal cells and protoplasts: (1) culture of a
sufficient number of cells (typically about 2 x 10^7 mitotic cells) to yield,
preferably, on the order of 1 x 10^6 artificial chromosomes, (2) arrest of the
cell cycle of the cells in a stage of mitosis, preferably metaphase, using a
mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly
by cell wall dissolution for plant cells and/or swelling of the cells in hypotonic
buffer, to increase susceptibility of the cells to disruption, (4) application
of physical force to disrupt the cells in the presence of isolation buffers for
stabilization of the released chromosomes, (5) dispersal of chromosomes in
the presence of isolation buffers for stabilization of free chromosomes, (6)
separation of artificial chromosomes from endogenous chromosomes and
(7) storage (and shipping if desired) of the isolated artificial chromosomes in
appropriate buffers. Modifications and variations of the general procedure
for isolation of artificial chromosomes, for example to accommodate different
cell types with differing growth characteristics and requirements and to
optimize the duration of mitotic block with arresting agents to obtain the
desired balance of chromosome yield and level of debris, may be empirically
determined (see Examples).
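
The typical figures given in step (1) above imply an overall recovery that can
be used for rough culture planning. The following Python sketch works through
that arithmetic under the stated assumption of one artificial chromosome per
mitotic cell; that assumption, the function names and the example target are
hypothetical and are not taken from this description.

```python
import math
from fractions import Fraction

# Rough planning arithmetic based on the typical figures in step (1).
# Assumption (not from the text): one artificial chromosome per mitotic cell.


def recovery_fraction(mitotic_cells, chromosomes_recovered, acs_per_cell=1):
    """Overall fraction of artificial chromosomes recovered (integer inputs)."""
    return Fraction(chromosomes_recovered, mitotic_cells * acs_per_cell)


def mitotic_cells_needed(target_chromosomes, recovery, acs_per_cell=1):
    """Mitotic cells to culture for a target yield at a given recovery."""
    return math.ceil(Fraction(target_chromosomes) / (recovery * acs_per_cell))


if __name__ == "__main__":
    # About 2 x 10^7 mitotic cells yielding on the order of 1 x 10^6
    # artificial chromosomes.
    recovery = recovery_fraction(2 * 10**7, 10**6)
    print("implied recovery:", float(recovery))
    # Hypothetical target of 5 x 10^6 isolated artificial chromosomes.
    print("mitotic cells needed:", mitotic_cells_needed(5 * 10**6, recovery))
```

Exact fractions are used instead of floating-point division so that the
rounding up to whole cells is not affected by representation error.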

Steps 1-5 relate to isolation of metaphase chromosomes. The
separation of artificial from endogenous chromosomes (step 6) may be
accomplished in a variety of ways. For example, the chromosomes may be
stained with DNA-specific dyes such as Hoechst 33258 and chromomycin
A3 and sorted into artificial chromosomes and endogenous chromosomes on
the basis of dye content by employing fluorescence-activated cell sorting
(FACS).

Artificial chromosomes have been isolated by fluorescence-activated
cell sorting (FACS). This method takes advantage of the nucleotide base
content of the artificial chromosomes. In the case of predominantly
heterochromatic artificial chromosomes, by virtue of their high
heterochromatic DNA content, they will differ from any other chromosomes
in a cell. In a particular embodiment, metaphase chromosomes are isolated
and stained with base-specific dyes, such as Hoechst 33258 and
chromomycin A3. Fluorescence-activated cell sorting will separate artificial
chromosomes from the endogenous chromosomes. A dual-laser cell sorter
(such as, for example, a FACS Vantage, Becton Dickinson Immunocytometry
Systems) in which two lasers are set to excite the dyes separately allows
a bivariate analysis of the chromosomes by base-pair composition and size.
Cells containing such artificial chromosomes can be similarly sorted.

Preparative amounts of artificial chromosomes (for example, 5 x 10^4 to
5 x 10^7 chromosomes/ml) at a purity of 95% or higher can be obtained. The
resulting artificial chromosomes are used for delivery to cells by methods
such as, for example, microinjection, liposome-mediated transfer, and
electroporation.
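
The bivariate sorting and the purity figure discussed above can be illustrated
with a simple gate on the two dye signals. The following Python sketch assumes
each event is a (Hoechst signal, chromomycin signal, label) tuple from a
labeled test sample and uses a rectangular gate; the gate shape, the example
values and all names are hypothetical. Where an artificial chromosome falls in
the bivariate plot depends on its base composition, which is not assumed here.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

# Illustrative bivariate gate on Hoechst 33258 and chromomycin A3 signals.
# Gate bounds and event values are made up for the example.


class Gate(NamedTuple):
    hoechst_min: float
    hoechst_max: float
    chromomycin_min: float
    chromomycin_max: float

    def contains(self, hoechst: float, chromomycin: float) -> bool:
        """True if the event lies inside the rectangular gate."""
        return (self.hoechst_min <= hoechst <= self.hoechst_max
                and self.chromomycin_min <= chromomycin <= self.chromomycin_max)


def gated_purity(events: Sequence[Tuple[float, float, bool]],
                 gate: Gate) -> Optional[float]:
    """Fraction of gated events labeled as artificial chromosomes.

    Returns None when no event falls inside the gate.
    """
    gated = [is_ac for hoechst, chromo, is_ac in events
             if gate.contains(hoechst, chromo)]
    if not gated:
        return None
    return sum(1 for is_ac in gated if is_ac) / len(gated)


if __name__ == "__main__":
    events = [(80.0, 20.0, True), (82.0, 22.0, True),
              (40.0, 45.0, False), (79.0, 21.0, False)]
    gate = Gate(70.0, 90.0, 10.0, 30.0)
    print("purity inside gate:", gated_purity(events, gate))
```

Tightening the gate bounds trades yield for purity; a gate would be adjusted
on a labeled sample until the sorted fraction reaches the desired purity.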

Additional methods provided herein for isolation of artificial
chromosomes from endogenous chromosomes include procedures that are
particularly well suited for large-scale isolation of artificial chromosomes. In
these methods, the size and density differences between artificial
chromosomes and endogenous chromosomes are exploited to effect
separation of these two types of chromosomes. To facilitate larger scale
isolation of the artificial chromosomes, different separation techniques may
be employed such as swinging bucket centrifugation (to effect separation
based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968)
J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on
the basis of chromosome size and density) [see, e.g., Burki et al. (1973)
Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res.
Commun. 83:1404-1414], and velocity sedimentation (to effect separation on
the basis of chromosome size and shape) [see, e.g., Collard et al. (1984)
Cytometry 5:9-19].

Affinity-, particularly immunoaffinity-, based methods for separation of
ACs from endogenous chromosomes are also provided herein. For example,
artificial chromosomes which are predominantly heterochromatin may be
separated from endogenous chromosomes through immunoaffinity
procedures involving antibodies that specifically recognize heterochromatin,
and/or the proteins associated therewith, when the endogenous
chromosomes contain relatively little heterochromatin.

Immunoaffinity purification may also be employed in larger scale
artificial chromosome isolation procedures. In this process, large
populations of artificial chromosome-containing cells (asynchronous or
mitotically enriched) are harvested en masse and the mitotic chromosomes
(which can be released from the cells using standard procedures such as by
incubation of the cells, such as freshly isolated protoplasts, in hypotonic
buffer and/or detergent treatment of the cells in conjunction with physical
disruption of the treated cells) are enriched by binding to antibodies that are
bound to solid state matrices (e.g., column resins or magnetic beads).
Antibodies suitable for use in this procedure bind to condensed centromeric
proteins or condensed and DNA-bound histone proteins. For example,
autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288),
which recognizes mammalian centromeres, may be used for large-scale
isolation of chromosomes prior to subsequent separation of artificial
chromosomes from endogenous chromosomes using methods such as FACS.
The bound chromosomes would be washed and eventually eluted for sorting.

Immunoaffinity purification may also be used directly to separate
artificial chromosomes from endogenous chromosomes. For example, in the
case of artificial chromosomes that are predominantly heterochromatic, the
artificial chromosomes may be generated in or transferred to (e.g., by
microinjection or microcell fusion as described herein) a cell line that has
chromosomes that contain relatively small amounts of heterochromatin, such
as hamster cells (e.g., V79 cells or CHO-K1 cells). The predominantly
heterochromatic artificial chromosomes are then separated from the
endogenous chromosomes by utilizing anti-heterochromatin binding protein
(Drosophila HP-1) antibody conjugated to a solid matrix. Such matrix
preferentially binds artificial chromosomes relative to hamster chromosomes.
Unbound hamster chromosomes are washed away from the matrix and the
artificial chromosomes are eluted by standard techniques. Similarly, artificial
chromosomes of one species, e.g., a plant-derived artificial chromosome,
may be separated from a background of endogenous chromosomes of
another species, e.g., animal, such as mammalian, chromosomes, based on
immunological differences of the two species, provided that antibodies that
specifically recognize one species and not the other are available or can be
generated.

D. Generation of Artificial Chromosomes Through Assembly of
Component Elements

Artificial chromosomes can be constructed in vitro by assembling the
structural and functional elements that contribute to a complete chromosome
capable of stable replication and segregation alongside endogenous
chromosomes in cells. The identification of the discrete elements that in
combination yield a functional chromosome has made possible the in vitro
assembly of artificial chromosomes. The process of in vitro assembly of
artificial chromosomes, which can be rigidly controlled, provides advantages
that may be desired in the generation of chromosomes that, for example, are
required in large amounts or that are intended for specific use in transgenic
organism systems.

For example, in vitro assembly may be advantageous when efficiency
of time and scale are important considerations in the preparation of artificial
chromosomes. Because in vitro assembly methods do not involve extensive
cell culture procedures, they may be utilized when the time and labor
required to transform, feed, cultivate, and harvest cells used in de novo
cell-based production systems are unavailable.

Provided herein are in vitro assembly methods that include the joining
of essential components, such as a centromere, telomere and an origin of
replication, to yield an artificial chromosome, in particular, an artificial
chromosome that functions in plants and that may contain components
derived from plant chromosomes. Also provided are artificial chromosomes
produced by the methods. Particular embodiments of the methods and
chromosomes include a megareplicator. The megareplicator may contain
rDNA, for example, mammalian or plant rDNA. In vitro assembled artificial
chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, while still containing
protein-encoding DNA, or may contain increasing amounts of euchromatic
DNA, such that, for example, it contains about 10%, 20%, 30%, 40%,
50%, 60%, 70%, 80%, 90% or greater than about 90% euchromatic DNA.

In vitro assembly may also be rigorously controlled with respect to the
exact manner in which the several elements of the desired artificial
chromosome are combined and in what sequence and proportions they are
assembled to yield a chromosome of precise specifications. This feature is
of particular significance in the generation of plant artificial chromosomes
containing one or more regions of segmentation as described herein with
reference to amplification-based artificial chromosomes. For example, certain
plant chromosome structures (such as acrocentric chromosomes and/or
chromosomes containing adjacent regions of heterochromatin and rDNA) that
may be desirable for use in the generation of particular types of plant
artificial chromosomes via amplification-based methods as described herein
may be limited in number or may not exist. These particular types of plant
artificial chromosomes, e.g., certain predominantly heterochromatic plant
artificial chromosomes, may also be generated via in vitro assembly of
artificial chromosomes as described herein.

For example, plant artificial chromosomes containing regions of
repeated nucleic acid units that are predominantly heterochromatic may be
assembled by joining essential chromosomal components and repeat regions,
or may be generated from an in vitro assembled artificial chromosome via
amplification of heterochromatic DNA contained within an in vitro assembled
artificial chromosome. For generation of such chromosomes via amplification
of heterochromatic DNA contained within an in vitro assembled artificial
chromosome, nucleic acids are introduced into a cell containing an in vitro
assembled artificial chromosome and a resulting cell is selected that contains
an artificial chromosome containing one or more regions of repeated nucleic
acid units that are predominantly heterochromatic. The in vitro assembled
artificial chromosome either contains a megareplicator to facilitate
amplification of chromosomal DNA in connection with integration of nucleic
acid into the chromosome, or megareplicator-containing DNA is included in
the nucleic acid that is integrated into the in vitro assembled artificial
chromosome.

The following describes the processes involved in the assembly of
artificial chromosomes in vitro, utilizing a megachromosome as exemplary
starting material.

1. Identification and isolation of the components of the artificial
chromosome

The chromosomes provided herein are elegantly simple chromosomes
for use in the identification and isolation of components to be used in the in
vitro assembly of expression systems or artificial chromosomes. The ability
to purify artificial chromosomes to a very high level of purity, as described
herein, facilitates their use for these purposes. For example, the
megachromosome, particularly truncated forms thereof, can serve as starting
material. With respect to the construction of an artificial chromosome
containing at least some mammalian cell derived components, possible
starting materials can be obtained from, for example, cell lines such as 1B3
and mM2C1, which are derived from H1D3 (deposited at the European
Collection of Animal Cell Culture (ECACC) under Accession No. 96040929).
With respect to the construction of an artificial chromosome containing at
least some plant cell derived components, possible starting materials include
cells containing PACs, e.g., megachromosomes, generated as described
herein.



WO 02/096923 



PCT/US02/17451 




For example, the mM2C1 cell line contains a micro-megachromosome
(~50-60 kB), which advantageously contains only one centromere, two
regions of integrated heterologous DNA with adjacent rDNA sequences, with
the remainder of the chromosomal DNA being mouse major satellite DNA.
Other truncated megachromosomes can serve as a source of telomeres, or
telomeres can be provided. The centromere of the mM2C1 cell line contains
mouse minor satellite DNA, which provides a useful tag for isolation of the
centromeric DNA.

Additional features of particular ACs provided herein, such as the
micro-megachromosome of the mM2C1 cell line, that make them uniquely
suited to serve as starting materials in the isolation and identification of
chromosomal components include the fact that the centromeres of each
megachromosome within a single specific cell line are identical. The ability
to begin with a homogeneous centromere source (as opposed to a mixture of
different chromosomes having differing centromeric sequences) greatly
facilitates the cloning of the centromere DNA. By digesting purified
megachromosomes, particularly truncated megachromosomes, such as the
micro-megachromosome, with appropriate restriction endonucleases and
cloning the fragments into commercially available and well known YAC
vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors
(see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797;
bacterial artificial chromosomes, which have a capacity of incorporating
0.9 - 1 Mb of DNA) or PAC vectors (the P1 artificial chromosome vector,
which is a P1 plasmid derivative that has a capacity of incorporating 300 kb
of DNA and that is delivered to E. coli host cells by electroporation rather
than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature
Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Patent No.
5,300,431 and International PCT application No. WO 92/14819), it
is possible for as few as 50 clones to represent the entire
micro-megachromosome.
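
A clone count of this kind can be related to a simple lower bound: the number of non-overlapping clones needed to cover a stretch of DNA is at least its length divided by the insert capacity of the vector. The following is only an illustrative sketch. The function name and the input sizes are hypothetical and are not values taken from this description, and a real library would need several-fold more clones because inserts overlap and vary in size.

```python
import math


def min_clones(total_kb: float, insert_capacity_kb: float) -> int:
    """Lower bound on the clones needed to tile total_kb without overlap."""
    if total_kb <= 0 or insert_capacity_kb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(total_kb / insert_capacity_kb)


# Hypothetical inputs chosen for illustration only:
# a 50,000 kb target cloned as 1,000 kb inserts or as 300 kb inserts.
print(min_clones(50_000, 1_000))
print(min_clones(50_000, 300))
```

The bound scales inversely with insert capacity, which is why large-capacity vectors are attractive for representing a whole chromosome in few clones.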

a. Centromeres 
An exemplary centromere for use in the construction of an artificial
chromosome is that contained within a megachromosome, such as those
described herein. One example of a particular megachromosome-containing
cell line provided is H1D3 and derivatives thereof, such as
mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing,
for example, the procedures described herein, and the centromeric sequence
is extracted from the isolated megachromosomes. For example, the
megachromosomes may be separated into fragments utilizing selected
restriction endonucleases that recognize and cut at sites that, for instance,
are primarily located in the replication and/or heterologous DNA integration
sites and/or in the satellite DNA. Based on the sizes of the resulting
fragments, certain undesired elements may be separated from the
centromere-containing sequences. The centromere-containing DNA could be
as large as 1 Mb.
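
The size-based separation described above can be pictured with a small in silico digest: cut a linear sequence at every occurrence of a recognition site and list the fragment lengths. This is a minimal sketch under stated assumptions. The function name, the toy sequence and the choice of GAATTC (cut after the first base) as the recognition site are illustrative only; the description does not specify which endonucleases are used.

```python
def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Cut a linear sequence at each occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for a cut after the first base of the site).
    """
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0, *cuts, len(seq)]
    return [seq[i:j] for i, j in zip(bounds, bounds[1:])]


# Toy sequence with two sites (illustrative only).
fragments = digest("AAGAATTCTTTTGAATTCCC", "GAATTC", 1)
print([len(f) for f in fragments])
```

Sorting such fragment lengths is the computational analogue of resolving a digest by size before picking the centromere-containing fraction.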

Probes that specifically recognize centromeric sequences, such as
mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl.
Acids Res. 16:11645-11661], pCT4.2 probe, a 3.5 kb fragment of
Arabidopsis 5S rDNA (Campbell et al. (1992) Gene 112:225-228),
Arabidopsis cosmids E4.11 (30 kb) and E4.6 (33 kb; Bent et al. (1994)
Science 265:1856-1860); and 180 bp pAL1 repeat sequence (Maluszynska et
al. (1991) Plant J. 1:159-166; and Martinez-Zapater et al. (1986) Mol. Gen.
Genet. 204:417-423) may be used to isolate a centromere-containing YAC,
BAC or PAC clone derived from the megachromosome. Alternatively, or in
conjunction with the direct identification of centromere-containing
megachromosomal DNA, probes that specifically recognize the
non-centromeric elements, such as probes specific for mouse major satellite DNA,
plant satellite DNA, the heterologous DNA and/or rDNA, may be used to
identify and eliminate the non-centromeric DNA-containing clones.

Additionally, centromere cloning methods described herein may be
utilized to isolate the centromere-containing sequence of the
megachromosome.

Once the centromere fragment has been isolated, it may be sequenced
and the sequence information may in turn be used in PCR amplification of
centromere sequences from megachromosomes or other sources of
centromeres. Isolated centromeres may also be tested for function in vivo by
transferring the DNA into a host cell. Functional analysis may include, for
example, examining the ability of the centromere sequence to bind
centromere-binding proteins. The cloned centromere will be transferred to
cells with a selectable marker gene and the binding of a centromere-specific
protein, such as anti-centromere antibodies (e.g., LU851, see Hadlaczky et
al. (1986) Exp. Cell Res. 167:1-15), can be used to assess function of the
centromeres.

b. Telomeres 

Telomeres that may be used in assembly of an artificial chromosome
include a 1 kB synthetic telomere (see, e.g., PCT Application Publication No.
WO 97/40183). A double synthetic telomere construct, which contains a 1
kB synthetic telomere linked to a dominant selectable marker gene that
continues in an inverted orientation, may be used for ease of manipulation.
Such a double construct contains a series of TTAGGG repeats 3' of the
marker gene and a series of repeats of the inverted sequence, i.e., GGGATT,
5' of the marker gene as follows:

(GGGATT)n - dominant marker gene - (TTAGGG)n. Using an inverted
marker provides an easy means for insertion, such as by blunt end ligation,
since only properly oriented fragments will be selected.

Telomere sequences also include sequences described in plants, for
example, an Arabidopsis sequence containing head-to-tail arrays of the
monomer repeat CCCTAAA totaling a few, for example 3-4, kb in length.
Telomere sequences vary in length and do not appear to have a strict length
requirement. An example of a cloned telomere is found in GenBank
accession no. M20158 (Richards and Ausubel (1988) Cell 53:127-136) and
in U.S. Patent No. 5,270,201. Yeast telomere sequences include those
provided in GenBank accession no. S70807 (Louis et al. (1994) Yeast
10:271-274). Additionally, a method for isolating a higher eukaryotic
telomere from A. thaliana has been reported (Richards and Ausubel (1988)
Cell 53:127-136; and U.S. Patent No. 5,270,201).
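
The repeat arrangements described in this subsection can be made concrete with a few string operations. The sketch below is illustrative only: the helper names and the "[marker]" placeholder are hypothetical, and the sequences are treated as plain text rather than as any particular cloned construct. It shows the inverted (reversed) hexamer flanking a marker, an estimate of how many monomers fit in an array of a given length, and the fact that CCCTAAA is the reverse complement of TTTAGGG, i.e. the same repeat read on the opposite strand.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse(seq: str) -> str:
    """Same strand read backwards (the 'inverted sequence' in the text)."""
    return seq[::-1]


def reverse_complement(seq: str) -> str:
    """Opposite strand read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def repeat_count(array_len_bp: int, unit: str) -> int:
    """Whole monomers that fit in an array of the given length."""
    return array_len_bp // len(unit)


def double_telomere(unit: str, n: int, marker: str) -> str:
    """(inverted unit)n - marker - (unit)n, as a flat string."""
    return reverse(unit) * n + marker + unit * n


print(reverse("TTAGGG"))
print(reverse_complement("TTTAGGG"))
print(repeat_count(1000, "TTAGGG"))
print(repeat_count(3000, "CCCTAAA"), repeat_count(4000, "CCCTAAA"))
print(double_telomere("TTAGGG", 2, "[marker]"))
```

Under these assumptions a 1 kb hexamer array holds on the order of 166 monomers, and a 3-4 kb heptamer array holds several hundred.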

c. Megareplicator

The megareplicator sequences, such as those containing rDNA,
provided herein are preferred for use in artificial chromosomes generated by
assembly of component elements in vitro. The rDNA provides an origin of
replication and also provides sequences that facilitate amplification of the
artificial chromosome in vivo to increase the size of the chromosome to, for
example, accommodate increasing copies of a heterologous gene of interest
as well as continuous high levels of expression of the heterologous genes.

d. Filler heterochromatin
Filler heterochromatin, particularly satellite DNA, is included to
maintain structural integrity and stability of the artificial chromosome and
provide a structural base for carrying genes within the chromosome. The
satellite DNA is typically A/T-rich DNA sequence, such as mouse major
satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite
DNA. Sources of such DNA include any eukaryotic organisms that carry
non-coding satellite DNA with sufficient A/T or G/C composition to promote
ready separation by sequence, such as by FACS, or by density gradients.
Examples of plant satellite DNA include, but are not limited to, satellite DNA
of soybean (see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 23:857-862), satellite DNA on
the rye B chromosome (see, e.g., Langdon et al. (2000) Genetics
154:869-884) and satellite DNA in the Saccharum complex (see, e.g., Alix et al.
(1998) Genome 41:854-864). The satellite DNA may also be synthesized by
generating sequence containing monotone, tandem repeats of highly A/T- or
G/C-rich DNA units.

The most suitable amount of filler heterochromatin for use in
construction of the artificial chromosome may be empirically determined by,
for example, including segments of various lengths, increasing in size, in the
construction process. Fragments that are too small to be suitable for use will
not provide for a functional chromosome, which may be evaluated in
cell-based expression studies, or will result in a chromosome of limited functional
lifetime or mitotic and structural stability.

e. Selectable marker 
Any convenient selectable marker, including specific examples
described herein, may be used and at any convenient locus in the expression
system.

2. Combination of the isolated chromosomal elements 
Once the isolated elements are obtained, they may be combined to
generate the complete, functional artificial chromosome expression system.
This assembly can be accomplished, for example, by in vitro ligation either in
solution, in LMP agarose or on microbeads. The ligation is conducted so that
one end of the centromere is directly joined to a telomere. The other end of
the centromere, which serves as the gene-carrying chromosome arm, is built
up from a combination of satellite DNA and megareplicator sequences, e.g.,
rDNA sequence, and may also contain a selectable marker gene. Another
telomere is joined to the end of the gene-carrying chromosome arm. The
gene-carrying arm is the site at which any heterologous genes of interest, for
example, in expression of desired proteins encoded thereby, are incorporated
either during in vitro assembly of the chromosome or sometime thereafter.
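
The order of elements described above (telomere, centromere, gene-carrying arm, telomere) can be sketched as a small data structure with a layout check. This is an illustrative model only, not a laboratory protocol: the class, the function names and the vocabulary of element kinds are hypothetical and are chosen simply to mirror the wording of this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    kind: str        # e.g. "telomere", "centromere", "satellite"
    label: str = ""  # free-text note, purely illustrative


# Element kinds permitted on the gene-carrying arm (an assumed vocabulary).
ARM_KINDS = {"satellite", "megareplicator", "marker", "heterologous"}


def validate(layout: list[Element]) -> None:
    """Raise ValueError if the layout departs from the described order."""
    kinds = [e.kind for e in layout]
    if len(kinds) < 3 or kinds[0] != "telomere" or kinds[-1] != "telomere":
        raise ValueError("a telomere is required at each end")
    if kinds[1] != "centromere" or kinds.count("centromere") != 1:
        raise ValueError("one centromere, joined directly to the first telomere")
    arm = kinds[2:-1]
    unknown = set(arm) - ARM_KINDS
    if unknown:
        raise ValueError(f"unexpected arm elements: {sorted(unknown)}")
    if "satellite" not in arm or "megareplicator" not in arm:
        raise ValueError("arm needs satellite DNA and a megareplicator")


def assemble(arm: list[Element]) -> list[Element]:
    """Place the arm between a telomere-centromere end and a distal telomere."""
    layout = [Element("telomere"), Element("centromere"), *arm, Element("telomere")]
    validate(layout)
    return layout


example = assemble([
    Element("satellite"),
    Element("megareplicator", "rDNA"),
    Element("marker"),
])
print(" - ".join(e.kind for e in example))
```

Heterologous genes can be modeled as additional "heterologous" elements on the arm, added either at assembly time or to an existing layout that is then re-validated.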
3. Analysis and testing of the artificial chromosome expression
systems

Artificial chromosomes assembled in vitro may be tested for
functionality in cell systems, such as plant and animal cells, using any of the
methods described herein for artificial chromosomes or minichromosomes,
or known to those of skill in the art.

4. Introduction of desired heterologous DNA into the in vitro 
assembled chromosome 

Heterologous DNA may be introduced into the in vitro synthesized
chromosome using routine methods of molecular biology, may be introduced
using the methods described herein for the artificial chromosomes, or may be
incorporated into the in vitro assembled chromosome as part of one of the
synthetic elements, such as the heterochromatin. The heterologous DNA
may be linked to a selected repeated fragment, and then the resulting
construct may be amplified in vitro using the methods for such in vitro
amplification provided herein.

In a particular embodiment of these in vitro assembly methods, a
site-specific recombination site is included in the assembly DNA or is added into
the assembled chromosome, such as a plant in vitro assembled artificial
chromosome, after initial assembly. The presence of a recombination site in
the in vitro assembled artificial chromosome facilitates recombinase-catalyzed
introduction of heterologous nucleic acid into the chromosome if the
heterologous nucleic acid also contains a complementary recombination site.
Such recombination systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 37:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 5:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural and artificial chromosomes is described in copending U.S. provisional
application Serial No. 60/294,758, by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S.
provisional application Serial No. 60/366,891, by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on March 21, 2002, U.S. patent
application Serial No. , by Perkins et al., entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2002, under attorney docket no.
24601-420, and PCT International Application No. , by Perkins et al.,
entitled "CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002,
under attorney docket no. 24601-420PC, each of which is incorporated
herein in its entirety by reference thereto. Thus, also contemplated herein
are in vitro assembled artificial chromosomes, in particular such
chromosomes containing plant chromosome-derived components, that
contain one or more recombination sites, such as an att site.

E. Methods for the Production of Plant Acrocentric Chromosomes and
Plant Chromosomes Containing Adjacent Regions of rDNA and
Heterochromatin

Acrocentric human and mouse chromosomes in which the short arm
contains only pericentric heterochromatin, an rDNA array, and telomeres can
be used in the de novo formation of a satellite DNA based artificial
chromosome (SATAC, also referred to as ACes). In some embodiments of
the methods of producing a plant artificial chromosome provided herein, it
may be desirable to introduce heterologous nucleic acids into a plant
chromosome with arms of unequal length (e.g., into the short arm of an
acrocentric chromosome) and/or containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin or satellite DNA. Of
particular interest in such methods are plant acrocentric chromosomes that
contain rDNA located adjacent to the pericentric heterochromatin or satellite
DNA, and, in particular, on the short arm of the chromosome with little to no
euchromatic DNA between the rDNA and the pericentric heterochromatin.
Utilizing such structures as the initial composition in the generation of plant
artificial chromosomes may facilitate generation of plant artificial
chromosomes that are predominantly heterochromatic. For example,
introduction of heterologous nucleic acid into a cell containing such an
acrocentric plant chromosome such that the nucleic acid integrates into the
pericentric heterochromatin and/or rDNA of the short arm of the chromosome
may be associated with amplification (possibly through "megareplicator"
DNA sequences such as may reside in plant rDNA arrays, also known as the
nucleolar organizing regions (NOR)) of heterochromatin that leads to the
formation of a predominantly heterochromatic plant artificial chromosome.

Naturally occurring acrocentric plant chromosomes are limited in
number, and plant chromosomes with a structure that includes adjacent
regions of heterochromatin and rDNA may not exist or may not exist for a
variety of plant species. Provided herein are methods for generating
acrocentric plant chromosomes and plant chromosomes containing adjacent
regions of rDNA and heterochromatin, in particular, pericentric and/or
satellite heterochromatin. Further provided herein are methods for generating
acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome.

Also provided herein are plant acrocentric chromosomes in which the
nucleic acid of one or both arms of the chromosome contains less than about
50%, or less than about 40%, or less than about 30%, or less than about
20%, or less than about 10%, or less than about 5%, or less than about
2%, or less than about 1%, or less than about 0.5% or less than about
0.1% euchromatin. In some embodiments of these chromosomes, the
nucleic acid of only one arm, either the short arm or the long arm, contains
less than these specified amounts of euchromatin. In a particular
embodiment of these chromosomes, the nucleic acid of the short arm
contains less than these specified amounts of euchromatin.

Further provided herein are plant chromosomes containing adjacent
regions of heterochromatin, in particular pericentric heterochromatin or
satellite DNA, and rDNA with little to no euchromatin between the two
regions. With reference to such plant chromosomes, "little to no" means that
the amount of euchromatic DNA, if any, located between the rDNA and
heterochromatin (such as pericentric heterochromatin and/or satellite DNA)
generally does not stain diffusely and recognizably as euchromatin and/or
does not contain protein-encoding genes. Thus, in these chromosomes,
between the heterochromatin (such as pericentric heterochromatin and/or
satellite DNA) and the rDNA, there is substantially no chromatin that is less
condensed than the heterochromatin (e.g., pericentric heterochromatin). The
plant chromosomes containing adjacent regions of rDNA and
heterochromatin (such as pericentric heterochromatin) provided herein may
be acrocentric chromosomes. In a particular embodiment of these plant
chromosomes, the adjacent regions of rDNA and heterochromatin, in
particular pericentric heterochromatin, are contained on the short arm of the
chromosome.

Further provided are methods of utilizing such plant chromosomes in
the generation of plant artificial chromosomes, and, in particular,
predominantly heterochromatic plant artificial chromosomes, such as ACes
(also referred to as SATACs). In particular methods of producing plant
artificial chromosomes provided herein, nucleic acids are introduced into a
cell containing a plant chromosome that is acrocentric and/or contains
adjacent regions of rDNA and heterochromatin, such as pericentric
heterochromatin, the cells are cultured through at least one cell division and
a cell comprising an artificial chromosome, such as a predominantly
heterochromatic artificial chromosome, is selected. In these methods, the
plant chromosome into which nucleic acid is introduced may be an
acrocentric chromosome containing adjacent regions of rDNA and
heterochromatin on the short or long arm, and, in particular, on the short
arm.

The plant chromosomes provided herein can be generated using
site-specific recombination between plant chromosome regions. The regions may
be on the same chromosome or separate chromosomes. Through
site-specific recombination, sections of plant chromosomes may be altered to
remove, invert and/or insert sequences such that a desired plant
chromosome results. The resulting plant chromosome is acrocentric and/or
contains adjacent regions of heterochromatic DNA and rDNA, which may or
may not be on the short arm of an acrocentric chromosome. Thus, the
starting chromosome in these methods may be a plant chromosome or may
be a plant acrocentric chromosome that does not contain adjacent regions of
rDNA and heterochromatin, such as pericentric heterochromatin or satellite
DNA. If the starting chromosome is acrocentric, then it may be used in the
generation of a plant acrocentric chromosome that contains adjacent regions
of heterochromatic DNA (e.g., pericentric heterochromatin and/or satellite
DNA) and rDNA, particularly on the short arm of the chromosome, or to
generate a plant acrocentric chromosome in which the nucleic acid of one or
both arms contains less than about 50%, or less than about 40%, or less
than about 30%, or less than about 20%, or less than about 10%, or less
than about 5%, or less than about 2%, or less than about 1%, or less than
about 0.5% or less than about 0.1% euchromatin.

In one of the methods provided herein for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of rDNA
and heterochromatin, nucleic acid containing a site-specific recombination
site and nucleic acid containing a complementary site-specific recombination
site are introduced into a cell containing one or more plant chromosomes.
The nucleic acids may be introduced into the cell sequentially or
simultaneously. The nucleic acids may also be targeted to particular
chromosomes and/or particular sequences of a chromosome. Such targeting
may be accomplished by including in the nucleic acids sequences
homologous to particular sequences in the chromosome(s).

The cell is then exposed to a recombinase activity. The recombinase
activity can be provided by introduction of nucleic acid encoding the activity
into the cell for expression of the activity therein, or may be added to the cell
from an exogenous source. The recombinase activity is one that catalyzes
recombination between sequences at the two recombination sites. An
appropriate recombination event produces a plant chromosome that is
acrocentric and/or contains adjacent regions of rDNA and heterochromatin
(such as pericentric heterochromatin and/or satellite DNA), which may be
readily identified therein based on its particular structure (e.g., arms of
unequal length if the chromosome is acrocentric) and/or other features, e.g.,
the presence of particular added sequences, such as recombination sites and
DNA encoding a selectable marker, the absence of particular sequences,
such as excised euchromatic DNA, and the arrangement of sequences, such
as the placement of rDNA segments adjacent to pericentric heterochromatin
and/or satellite DNA. Such attributes may be detected using techniques
known in the art for the analysis of nucleic acids and chromosomes, such as,
for example, in situ hybridization.

A number of site-specific recombination systems may be used in the
production of plant chromosomes that are acrocentric and/or contain rDNA
adjacent to heterochromatin, such as pericentric heterochromatin, as
described herein. Such systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 37:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 5:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural chromosomes is described in copending U.S. provisional application
Serial No. 60/294,758, by Perkins et al., entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2001, U.S. provisional application Serial No.
60/366,891, by Perkins et al., entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al., entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC, each of which is incorporated herein in
its entirety by reference thereto. These systems, as well as others known in
the art, can be used to specifically excise or invert DNA (for example, in an
intrachromosomal recombination), exchange regions of DNA (for example, in
an inter-chromosomal recombination) or insert DNA (for example, through
recombination between homologous sequences at a recombination site and
the DNA to be inserted). The precise event is controlled by the orientation of
the recombination site DNA sequences.

In particular embodiments of the methods for producing an acrocentric
plant chromosome provided herein, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA (in particular, proximal satellite DNA) of one plant chromosome
in the cell. In a further embodiment, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into the distal end of an arm of a plant chromosome in the
cell. In these embodiments, recombination between the sites in the presence
of a recombinase that recognizes the sites can result in deletion of a portion
of an arm of a chromosome, reciprocal translocation between a distal portion
of a chromosome arm and a more proximal portion of another chromosome
arm or reciprocal translocation between pericentric heterochromatin and/or
satellite DNA of one chromosomal arm and a more distal portion of another
chromosome arm. Each of these recombination events can serve to reduce
the length of a chromosome arm and give rise to an acrocentric
chromosome.
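
The dependence of the recombination outcome on where the two sites sit and how they are oriented can be summarized in a small decision rule. The sketch below is a simplified illustration, not a model of any particular recombinase: the class, the function names, the chromosome labels and the use of arbitrary length units are all hypothetical. It assumes the commonly described behavior in which sites on the same chromosome in direct orientation give excision, sites on the same chromosome in inverted orientation give inversion, and sites on different chromosomes give an exchange (translocation).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    chromosome: str   # hypothetical chromosome label
    position: int     # distance from the centromere, arbitrary units
    orientation: int  # +1 or -1


def predict_event(a: Site, b: Site) -> str:
    """Classify the expected event for recombination between two sites."""
    if a.chromosome != b.chromosome:
        return "translocation"
    if a.orientation == b.orientation:
        return "excision"
    return "inversion"


def arm_length_after_excision(arm_length: int, a: Site, b: Site) -> int:
    """Arm length once the DNA between two directly oriented sites is removed."""
    if predict_event(a, b) != "excision":
        raise ValueError("sites do not predict an excision")
    return arm_length - abs(a.position - b.position)


pericentric = Site("chr1", 5, +1)
distal = Site("chr1", 95, +1)
print(predict_event(pericentric, distal))
print(arm_length_after_excision(100, pericentric, distal))
print(predict_event(pericentric, Site("chr1", 95, -1)))
print(predict_event(pericentric, Site("chr2", 95, +1)))
```

In this toy example, excising the DNA between a pericentric site and a distal site shortens a 100-unit arm to 10 units, which is the kind of arm shortening that yields an acrocentric chromosome.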

In another embodiment, a nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into the pericentric heterochromatin and/or satellite
DNA of one plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of an arm of another plant
chromosome in the cell. In this embodiment, recombination between the
sites in the presence of a recombinase that recognizes the sites can result in
reciprocal translocation between the pericentric heterochromatin and/or
satellite DNA of one chromosome and the distal portion of another
chromosome arm thereby bringing these two regions into close proximity on
one chromosomal arm and reducing the amount of DNA between the
pericentric region of the arm and the end of the arm to generate an
acrocentric plant chromosome.

These methods for producing an acrocentric plant chromosome may
also be conducted such that nucleic acid containing a site-specific
recombination site is introduced into a cell containing a plant chromosome
wherein it integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA of a plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of the same arm of the same
chromosome. In this embodiment, recombination between the sites in direct
(i.e., the same, or head-to-tail) orientation in the presence of a recombinase
that recognizes the sites can result in intrachromosomal recombination
between the pericentric heterochromatin (and/or satellite DNA) and the distal
portion of the chromosomal arm thereby excising DNA between these two
regions and reducing the amount of DNA between them to generate an
acrocentric plant chromosome.

In particular embodiments of the methods provided herein for
producing a plant chromosome containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
nucleic acid containing complementary recombinase recognition sites for
site-specific recombination is introduced into a cell containing one or more
plant chromosomes wherein one of the sites integrates into heterochromatin of
one plant chromosome in the cell. In a further embodiment, nucleic acid
containing complementary recombinase recognition sites for site-specific
recombination is introduced into a cell containing one or more plant
chromosomes wherein one of the sites integrates into rDNA or a nucleolar
organizing region (NOR) of a plant chromosome in the cell. In these
embodiments, recombination between the sites in the presence of a
recombinase that recognizes the sites can result in deletion of DNA between
a heterochromatic region, such as the pericentric heterochromatin (and/or
satellite DNA), and rDNA, inversion of DNA that includes heterochromatin or
rDNA of a plant chromosome, or reciprocal translocation between
heterochromatin of one chromosomal arm and rDNA of another chromosomal
arm. Each of these recombination events can serve to arrange chromosomal
DNA such that a region of heterochromatic DNA, such as pericentric
heterochromatin and/or satellite DNA, is adjacent to a region of rDNA on a
plant chromosome.

In another embodiment, nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into heterochromatin, such as, for example, pericentric
heterochromatin and/or satellite DNA, of one plant chromosome in the cell
and nucleic acid containing a complementary site-specific
recombination site is introduced into the cell wherein it integrates into rDNA
of another plant chromosome in the cell. In this embodiment, recombination
between the sites can result in reciprocal translocation between the
heterochromatin of one chromosome and the rDNA of another chromosome
thereby bringing these two regions into close proximity on one plant
chromosome with little to no euchromatin between them.

These methods for producing a plant chromosome containing adjacent
regions of heterochromatic DNA and rDNA may also be conducted such that
nucleic acid containing a site-specific recombination site is introduced into a
cell containing a plant chromosome wherein it integrates into
heterochromatin, for example, pericentric heterochromatin and/or satellite
DNA, of a plant chromosome and nucleic acid containing a complementary
site-specific recombination site is introduced into the cell wherein it
integrates into rDNA of the same chromosome. In this embodiment,
recombination between the sites in direct orientation in the presence of a
recombinase that recognizes the sites can result in intrachromosomal
recombination between heterochromatin, such as pericentric heterochromatin
(and/or satellite DNA), and rDNA thereby excising DNA, including
euchromatic DNA, between these two regions. Recombination of the sites in
indirect (i.e., head-to-head) orientation in the presence of a recombinase can
result in inversion of DNA between the sites thereby replacing DNA, such as
euchromatin, located between pericentric heterochromatin (and/or satellite
DNA) and rDNA on the chromosome with rDNA. Thus, in the resulting plant
chromosome, rDNA is located adjacent to pericentric heterochromatin (and/or
satellite DNA), and DNA that was present between the pericentric
heterochromatin (and/or satellite DNA) and the rDNA is located distal to the
rDNA in a position previously occupied by the rDNA.
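
The excision and inversion outcomes described above can be illustrated with a toy model in which a chromosome arm is an ordered list of labeled segments. The segment labels, the SITE marker and the function name are assumptions made for illustration; the placement of the second site (proximal or distal to the rDNA) is likewise chosen only so that each case reproduces the arrangement described in the text.

```python
SITE = "site"  # hypothetical label for a recombination site


def recombine_arm(arm, direct):
    """Return the arm after recombination between its two sites.

    arm    -- list of segment labels ordered from centromere to telomere,
              containing exactly two SITE entries (an assumption of this toy)
    direct -- True for head-to-tail sites, False for head-to-head sites
    """
    positions = [i for i, seg in enumerate(arm) if seg == SITE]
    if len(positions) != 2:
        raise ValueError("expected exactly two recombination sites")
    first, second = positions
    if direct:
        # Excision: the DNA between the sites and one site copy are lost.
        return arm[:first + 1] + arm[second + 1:]
    # Inversion: the DNA between the sites is reversed in place.
    return arm[:first + 1] + arm[first + 1:second][::-1] + arm[second:]


# Direct sites flanking euchromatin: excision leaves rDNA next to heterochromatin.
direct_arm = ["pericentric_het", SITE, "euchromatin", SITE, "rDNA", "distal"]
# Head-to-head sites flanking euchromatin plus rDNA: inversion swaps their order.
inverted_arm = ["pericentric_het", SITE, "euchromatin", "rDNA", SITE, "distal"]

print(recombine_arm(direct_arm, direct=True))
print(recombine_arm(inverted_arm, direct=False))
```

In both cases the rDNA segment ends up immediately after the site that follows the pericentric heterochromatin; in the inversion case the euchromatin is retained but now lies distal to the rDNA.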

In particular embodiments for producing an acrocentric plant
chromosome containing adjacent regions of heterochromatin, such as
pericentric heterochromatin (and/or satellite DNA), and rDNA, the short arm
of the acrocentric chromosome may be generated in the same recombination
event that places the heterochromatin and rDNA regions adjacent to each
other or in a separate recombination event. For example, nucleic acid
containing a site-specific recombination site may be introduced into a cell
containing one or more plant chromosomes wherein it integrates into the
pericentric heterochromatin of one plant chromosome and nucleic acid
containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA that is located at a
distal portion of another plant chromosome or the same arm of the same
chromosome. Recombination of the sites in the presence of a
recombinase can result in intra- or inter-chromosomal recombination that not
only brings the pericentric heterochromatin (and/or satellite DNA) and rDNA
into close proximity on one chromosomal arm, but also sufficiently reduces
the length of that arm such that the resulting chromosome is acrocentric.

If a single recombination event such as this does not generate an
acrocentric plant chromosome, multiple recombination events may be used to
produce an acrocentric plant chromosome containing adjacent regions of
heterochromatic DNA and rDNA. For example, nucleic acid containing a
site-specific recombination site may be introduced into a cell containing one
or more plant chromosomes wherein it integrates into the pericentric
heterochromatin (and/or satellite DNA) of one plant chromosome and nucleic
acid containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA of the same or a
different plant chromosome. As described above, recombination between
the sites in the presence of a recombinase can result in deletion, inversion or
reciprocal translocation of DNA to arrange chromosomal DNA such that
pericentric heterochromatin (and/or satellite DNA) is adjacent to a region of
rDNA on a plant chromosome. In order to reduce the length of the arm of
the chromosome on which the adjacent regions of heterochromatin and rDNA
are located, an additional recombination event can be induced by introducing
nucleic acid containing a site-specific recombination site into a cell containing
this plant chromosome wherein it integrates into a region of the chromosome
distal to the rDNA and nucleic acid containing a complementary site-specific
recombination site into the cell wherein it integrates into the distal end of the
same chromosome arm or of another plant chromosome arm. Recombination
between the recognition sites can result in deletion or reciprocal translocation
of DNA to reduce the length of the chromosome arm distal to the rDNA and
give rise to an acrocentric plant chromosome containing adjacent regions of
heterochromatin and rDNA on the short arm of the chromosome.

In each of the aforementioned methods for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of
heterochromatin and rDNA, the nucleic acid containing the two or more
recombination sites may be introduced simultaneously or sequentially into a
cell or cells using nucleic acid transfer methods described herein or known in
the art. The nucleic acids may randomly integrate into plant chromosomes or
may be targeted for integration into a particular region or site on a plant
chromosome through homologous recombination between sequences in the
nucleic acid and sequences within the chromosome. The recombinase
activity may be provided by introduction of nucleic acid encoding an
appropriate recombinase into the cell for expression therein. The
recombinase-encoding nucleic acid may be introduced into the cell prior to,
during or after introduction of nucleic acids encoding recombination sites.

To facilitate identification of cells containing the transferred nucleic
acids and/or in which a recombination event has occurred, nucleic acid
encoding a selectable marker may be introduced into the cell. For example,
one or both of the nucleic acids containing a recombination site may also
contain DNA encoding a selectable marker (e.g., a resistance-encoding
marker or a reporter molecule) operatively linked to a promoter which is
oriented such that integration of the nucleic acid into a chromosome places
the marker DNA between two directly oriented recombination sites on an arm
of a chromosome. A cell containing the nucleic acid will thus be resistant to
a selection agent or will detectably express a reporter molecule. Exposure of
the cell to the appropriate recombinase can result in a recombination event
that excises the DNA between the two recombination sites, which includes
DNA encoding the selectable marker. Thus, recombination could be detected
as loss of reporter molecule expression or decreased resistance to a selection
agent. After exposure to a recombinase, the cells into which nucleic
acids containing recombination sites have been transferred may be analyzed
for the presence of acrocentric plant chromosomes using, for example, FISH
analysis and other chromosome visualization techniques.

In another method provided herein for producing a plant chromosome
that is acrocentric and/or contains adjacent regions of heterochromatin and
rDNA, the recombination event or events that lead to formation of the
chromosome occur through crossing of transgenic plants that contain
chromosomes which contain complementary site-specific recombination
sites. Thus, in one embodiment of these methods, nucleic acid containing a
recombination site adjacent to nucleic acid encoding a selectable marker is
introduced into a first plant cell and a first transgenic plant is generated from
the first plant cell. Nucleic acid containing a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative
linkage is introduced into a second plant cell from which a second transgenic
plant is generated. The first and second transgenic plants are crossed to
obtain one or more plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker, and a resistant
plant that contains cells comprising a plant chromosome that is acrocentric
and/or contains adjacent regions of heterochromatin and rDNA is selected.

In an example of this method, nucleic acids containing site-specific
recombination sites are introduced into cells of Nicotiana tabacum. The
nucleic acids are introduced separately by infecting leaf explants with
Agrobacterium tumefaciens which carries the kanamycin-resistance gene
(KanR). Kanamycin-resistant transgenic plants are generated from the
infected leaf explants. One transgenic plant contains nucleic acid encoding a
promoterless hygromycin-resistance gene preceded by a lox site-specific
recombination sequence (lox-hpt); the other plant contains a cauliflower
mosaic virus 35S promoter linked to a lox sequence and the cre DNA
recombinase coding region (35S-lox-cre). The resultant KanR transgenic
plants are crossed (see, e.g., protocols of Qin et al. (1994) Proc. Natl. Acad.
Sci. U.S.A. 91:1706-1710). Plants in which the appropriate DNA
recombination event has occurred are identified by hygromycin-resistance.
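
The selection logic of this example can be sketched as a toy model: resistance arises only when recombination at the lox sites places the 35S promoter upstream of the promoterless hpt coding region. The segment labels, function names and chromosome layouts below are illustrative assumptions, and the sketch assumes that both lox sites lie in the same orientation and that recombinase is supplied by the 35S-lox-cre construct before the exchange.

```python
def exchange_at_lox(chrom_a, chrom_b):
    """Swap the segments distal to the single lox site on each chromosome."""
    i = chrom_a.index("lox")
    j = chrom_b.index("lox")
    new_a = chrom_a[:i + 1] + chrom_b[j + 1:]
    new_b = chrom_b[:j + 1] + chrom_a[i + 1:]
    return new_a, new_b


def hygromycin_resistant(chromosomes):
    """True if any chromosome carries 35S, lox and hpt in consecutive order."""
    for chrom in chromosomes:
        for k in range(len(chrom) - 2):
            if chrom[k:k + 3] == ["35S", "lox", "hpt"]:
                return True
    return False


# Hypothetical layouts, centromere first, for the two parental constructs.
parent_1 = ["cen_A", "lox", "hpt", "tel_A"]         # lox-hpt (no promoter)
parent_2 = ["cen_B", "35S", "lox", "cre", "tel_B"]  # 35S-lox-cre

before = hygromycin_resistant([parent_1, parent_2])
after = hygromycin_resistant(exchange_at_lox(parent_1, parent_2))
print(before, after)
```

Under these assumptions the unrecombined pair carries no expressed hpt gene, whereas the exchanged pair contains a 35S-lox-hpt arrangement, which is why hygromycin resistance reports the recombination event.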

The KanR cultivars initially may be screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one construct
integrated in regions adjacent to the pericentric heterochromatin (and/or
satellite DNA) on the short arm of any chromosome. The second set of
candidate plants has the other construct integrated in rDNA, such as the
NOR region, of appropriate chromosomes. To obtain reciprocal translocation,
both sites must be in the same orientation. Therefore, a series of crosses
may be required, marker-resistant plants generated, and FISH analyses
performed to identify an "acrocentric" plant chromosome or chromosomes
that contain adjacent regions of heterochromatin. As described above, such
an acrocentric chromosome may be used for de novo plant artificial
chromosome formation, particularly predominantly heterochromatic plant
artificial chromosomes. The selection of appropriate plant lines can be done,
for example, using marker-assisted selection.

F. Incorporation of Heterologous Nucleic Acids into Artificial 
Chromosomes 

Heterologous nucleic acids can be introduced into artificial
chromosomes during or after formation. Incorporation of particular desired
nucleic acids into an artificial chromosome during generation thereof may be
accomplished by including the desired nucleic acids along with the nucleic
acid encoding a selectable marker and any other nucleic acids used in
artificial chromosome generation (e.g., targeting sequences that direct the
heterologous nucleic acid to the pericentric region of a chromosome) in the
transformation of a cell to initiate amplification and formation of an artificial
chromosome.

Alternatively, heterologous nucleic acids may be incorporated into an
artificial chromosome following formation thereof through transfection of a
cell containing the artificial chromosome with the heterologous nucleic acids.
In general, incorporation of such nucleic acids into the artificial chromosome
is assured through site-directed integration, such as may be accomplished by
including nucleic acids homologous or identical to DNA contained within the
artificial chromosome with the heterologous nucleic acid when transferring
it to the artificial chromosome. An additional selective marker gene may also
be included.

Additionally, introduction of nucleic acids, particularly DNA molecules,
to an artificial chromosome can be accomplished by the use of site-specific
recombinases as described herein (see, also, copending U.S. provisional
application Serial No. 60/294,758 by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S. provisional
application Serial No. 60/366,891, by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on March 21, 2002, U.S. patent
application Serial No. , by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2002, under attorney docket no. 24601-420,
and PCT International Application No. , by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC; each of which is incorporated in its
entirety by reference thereto). Artificial chromosomes can be produced
containing recombinase recognition sequences, to allow the site-specific
introduction of DNA molecules into the same. Another use for an introduced
recombinase site is to provide a region for site-specific integration of a new
trait by the use of recombinase mediated gene insertion.

G. Introduction of Artificial Chromosomes into Plant Cells and Recovery
of Plants Containing Artificial Chromosomes

Artificial chromosomes can be introduced into plant cells by a variety
of methods familiar to those skilled in the art. These methods include
chemical and physical methods for introduction of foreign DNA, as well as
cell culture methods to transfer chromosomes from one cell to another cell.

Any type of artificial chromosome can be used. Plant artificial
chromosomes (PACs) can be prepared by the in vivo and in vitro methods
described herein. PACs can be prepared inside plant protoplasts and then
transferred to other plant species and tissues, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 72-74). PACs can be isolated from the protoplasts in which they
were prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs can be isolated and delivered directly to plant protoplasts, plant
cells, or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, particle
bombardment, lipid-mediated method with or without sonoporation,
sonoporation alone, or any method known in the art as described herein
(Haim et al. (1985) Mol. Gen. Genet. 199:161-168; Fromm et al. (1986)
Nature 319:791-793; Fromm et al. (1985) Proc. Nat. Acad. Sci. USA
82:5824-5828; Klein et al. (1987) Nature 327:70; Klein et al. (1988) Proc.
Nat. Acad. Sci. USA 85:8502-8505; and International PCT application
publication no. WO 91/00358). Plant artificial chromosomes can also be
transferred to other plant species by preparation of protoplast-derived plant
microcells, and fusion of the microcells containing the plant artificial
chromosome with plant cells of other plant species.

Mammalian artificial chromosomes (MACs) can be transferred to plant
cells. Mammalian artificial chromosomes are prepared by the in vivo and in
vitro methods described in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application No. WO 97/40183. MACs can be prepared as
microcells, and the microcells can be fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
can be isolated and delivered directly to plant cells, protoplasts, and other
plant targets using a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, lipid-mediated method with or
without sonoporation, sonoporation alone, or any method known in the art as
described herein and in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets can be developed using standard conditions into roots, shoots,
plantlets, or any structure capable of growing into a plant.

Accordingly, methods for the introduction of artificial chromosomes 
represent the first step in the production of plant cells and whole plants 
containing artificial chromosomes from a variety of sources. 

The ability to introduce genes into plants, such that they are stably
expressed and transmissible from generation to generation, has
revolutionized plant biology and opens up new possibilities for using plants
as green factories for the production of commercially useful products as well
as for other applications described herein. There are several approaches to
the generation of stably transformed plants, and the adopted approach varies
according to the aims of the project. For introduction of artificial
chromosomes into plants, a variety of methods may be employed. For
transgenic plants, the transformation process involves the methods of foreign
DNA delivery to plant host cells, the growth and analysis of transformed
plant host cells, and the generation and regeneration of transgenic plants
from transformed plant host cells.

1. Introduction of artificial chromosomes into plant host cells

Numerous methods for producing or developing transgenic plants are
available to those of skill in the art. The method used is primarily a function
of the species of plant. Artificial chromosomes containing heterologous
DNA, such as artificial chromosomes prepared by the methods described
herein, can be introduced into plant host cells, including, but not limited to,
plant cells and protoplasts, by, for example, non-vector mediated DNA
transfer processes (see, also, copending U.S. application Serial No.
09/815,979, which describes methods for delivery that can be adapted for
use with plant cells and used with plant protoplasts).

Non-vector mediated, or direct, gene transfer systems involve the
introduction of heterologous DNA, in particular artificial chromosomes, into
host cells, including but not limited to plant cells and protoplasts, without the
use of a biological vector. The artificial chromosome that is introduced into
these plant host cells can lead to the development of transformed,
regenerable transgenic plants. The direct gene transfer systems for
transgenic plants are designed to overcome the barrier to DNA uptake
caused by the cell wall and the plasma membrane of plant cells. The
approaches for direct gene transfer include, but are not limited to, chemical,
electrical, and physical methods, which can also be adapted to optimize
transfer of artificial chromosomes (see, e.g., Uchimiya et al. (1989) J. of
Biotech. 12:1-20 for a review of such procedures; see also, e.g., U.S.
Patent Nos. 5,436,392; 5,489,520; Potrykus et al. (1985) Mol. Gen. Genet.
199:183; Lorz et al. (1985) Mol. Gen. Genet. 199:173; Fromm et al. (1985)
Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Uchimiya et al. (1986) Mol.
Gen. Genet. 204:204; Callis et al. (1987) Genes Dev. 1:1183-1200; Callis et
al. (1987) Nuc. Acids Res. 15:5823-5831; Marcotte et al. (1988) Nature
335:454 and Toriyama et al. (1988) Bio/Technology 6:1072-1074).

a. Chemical methods




Uptake of artificial chromosomes into plant cells, such as protoplasts,
can be accomplished in the absence or presence of polyethylene glycol
(PEG), which is a fusogen, or by any variations of such methods known to
those of skill in the art [see, e.g., U.S. Patent No. 4,684,611 to Schilperoot
et al.; Paskowski et al. (1984) EMBO J. 3:2717-2722; U.S. Patent Nos.
5,231,019 and 5,453,367]. In one approach, plant protoplasts are
incubated with a solution of foreign DNA, in particular artificial
chromosomes, and PEG at a concentration that allows for high cell survival
and high efficiency chromosome uptake. The protoplasts are then washed
and cultured [Datta and Datta (1999) Meth. in Molecular Biol. 111:335-348].
In an alternative approach, plant protoplasts are incubated with artificial
chromosomes in the presence of calcium phosphate for direct artificial
chromosome uptake (Haim et al. (1985) Mol. Gen. Genet. 199:161-168).
Alternatively, the artificial chromosome, in particular plant artificial
chromosome (PAC), is formed in a plant protoplast which is, in turn, fused
with another plant protoplast in the presence or absence of PEG to transfer
the PAC to the plant host protoplast. Such methods for treating protoplasts
with PEG and foreign DNA are well known in the art (Draper et al. (1982)
Plant Cell Physiol. 23:451-458; Krens et al. (1982) Nature 72-74).

Another chemical direct gene transfer method involves lipid-mediated
delivery of artificial chromosomes to plant protoplasts. In this process,
liposomes with encapsulated artificial chromosomes are allowed to fuse with
protoplasts alone or in the presence of PEG as the fusogen to transfer the
foreign DNA, in particular artificial chromosome, to the plant host protoplast
(Deshayes et al. (1985) EMBO J. 4:2731-2737; Fraley and Papahadjopoulos
(1982) Curr Top Microbiol Immunol 96:171-191).

Another direct gene transfer method involves the use of microcells.
The chromosomes can be transferred by preparing microcells containing
artificial chromosomes and then fusing the microcells with plant protoplasts.
Methods for the preparation and fusion of microcells with other cells are well
known in the art (see Example No. 4 and see also, e.g., U.S. Patent Nos.
5,240,840; 4,806,476; 5,298,429; 5,396,767; Fournier (1981) Proc. Natl.
Acad. Sci. U.S.A. 78:6349-6353; and Lambert et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:5907-59; Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149).

b. Electrical methods

Electroporation, which involves applying high-voltage electrical pulses to a
solution containing a mixture of protoplasts or plant cells and foreign DNA,
in particular artificial chromosomes, to create nanometer-sized, reversible
pores, is a common method to introduce DNA into plant cells or protoplasts.
The exogenous DNA may be added to the protoplasts in any form such as, for
example, naked linear, circular or supercoiled DNA, artificial chromosomes
encapsulated in liposomes, DNA in spheroplasts, artificial chromosomes in
other plant protoplasts, artificial chromosomes complexed with salts, and
other methods. The foreign DNA, in particular artificial chromosome, can also
include a phenotypic marker to identify plant cells that are successfully
transformed.

When plant cells or protoplasts are subjected to short electrical DC (direct
current) pulses, they may experience an increase in the permeability of the
plasma membrane and/or cell wall to hydrophilic molecules such as nucleic
acids, which are normally unable to enter the plant cell directly. Nucleic
acids are taken directly into the cell cytoplasm either through these pores or
as a consequence of the redistribution of membrane components that
accompanies closure of the pores. Certain cell wall-degrading enzymes, such
as pectin-degrading enzymes, may be employed to render the plant target
recipient cells more susceptible to DNA or artificial chromosome uptake by
electroporation than untreated cells. Plant recipient cells may also be
susceptible to transformation by mechanical wounding. To effect
transformation by electroporation, friable tissues such as a suspension
culture of cells or embryonic callus may be used or immature embryos or
other organized tissues may be directly transformed (see, e.g., Fromm et al.
(1986) Nature 319:791-793). Methods for effecting electroporation are well
known in the art (see, e.g., U.S. Patent Nos. 4,784,737; 4,970,154;
5,304,486; 5,501,967; 5,501,662; 5,019,034; 5,503,999; see, also, Fromm
et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Zimmerman et al.
(1981) Biophys Biochem Acta 641:160-165; Neuman et al. (1982) EMBO J.
1:841-845; Riggs et al. (1986) Proc. Nat. Acad. Sci. USA 83:5602-5606;
Lurquin (1997) Mol. Biotechnol. 7:5-35; Bates (1999) Methods in Molecular
Biology 111:359-366). Electroporation can be used to introduce nucleic
acids into tobacco mesophyll cells (Morikawa et al. (1986) Gene
41:121-124); leaf bases of rice (Dekeyser et al. (1990) Plant Cell 2:591-602);
immature maize embryos (Songstad et al. (1993) Plant Cell Tiss. Orgn. Cult.
40:1-15); macerated immature maize embryos (D'Halluin et al. (1992) Plant
Cell 4:1495-1505); suspension cultured maize cells (Laursen et al. (1994)
Plant Mol. Biol. 24:51-61); and sugar cane (Arencibia et al. (1995) Plant Cell
Rep. 14:305-309).

Artificial chromosomes may be delivered to plant cells, in particular
plant seeds, by the use of electroporation and pollen to derive pollen
comprising an artificial chromosome. Methods that may be used for delivery
of artificial chromosomes into pollen include, for example, techniques
described in U.S. Patent No. 5,049,500 and by Negrutiu et al. [in
Biotechnology and Ecology of Pollen, Mulcahy et al. eds., (1986) Springer
Verlag, N.Y., pp. 65-69] and Fromm et al. [(1986) Nature 319:791; including
methods for introducing DNA into mature pollen using various procedures
such as heat shock, PEG and electroporation]. The pollen is capable of
germinating and fertilizing an egg cell, leading to the formation of a plant
seed comprising an artificial chromosome.

c. Physical methods

The physical methods approach for introducing foreign DNA, in
particular artificial chromosomes, into plant cells overcomes the cell wall
barrier to DNA movement. Physical, or mechanical, means are used to
introduce transgenes directly into protoplasts or plant cells and include, but
are not limited to, microinjection, particle bombardment, and sonoporation.

(1) Microinjection 

Microinjection involves the mechanical injection of heterologous DNA,
in particular artificial chromosomes, into plant cells, including cultured cells
and cells in intact plant organs and embryoids in tissue culture via very small
micropipettes, needles, or syringes (Neuhaus et al. (1987) Theor. Appl. Genet.
75:30-36; Reich et al. (1986) Can. J. Bot. 64:1255-1258; Crossway et al.
(1986) BioTechniques 4:320-334; Crossway et al. (1986) Mol. Gen. Genet.
202:179; U.S. Patent No. 4,743,548) or silicon carbide whiskers (Kaeppler et
al. (1990) Plant Cell Rep. 9:415-418; Frame et al. (1994)). For example,
microinjection of protoplast cells with foreign DNA for transformation of plant
cells has been reported for barley and tobacco (see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 7:23-30). Single
artificial chromosomes may be front-loaded into microinjection needles and
then injected into cells ("pick-and-inject") following procedures as described
by Co et al. [(2000) Chromosome Res. 8:183-191].

(2) Particle bombardment 

Microprojectile bombardment (acceleration of small high density
particles, which contain the DNA, to high velocity with a particle gun
apparatus, which forces the particles to penetrate plant cell walls and
membranes) has also been used to introduce heterologous DNA into plant
cells. Microprojectile bombardment techniques for the introduction of nucleic
acids into plant cells, in addition to being an effective means of reproducibly
stably transforming plant cells, particularly monocots, do not require isolation
of protoplasts or susceptibility of the host cell to Agrobacterium infection. In
these methods, nucleic acids are carried through the cell wall and into the
cytoplasm on the surface of small, typically metal, particles (see, e.g., Klein
et al. (1987) Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A.
85:8502-8505; Klein et al. in Progress in Plant Cellular and Molecular
Biology, eds. Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J.,
Kluwer Academic Publishers, Dordrecht, (1988), p. 56-66 and McCabe et al.
(1988) Bio/Technology 6:923-926; Sautter et al. (1991) Bio/Technology
9:1080-1085; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Finer et al.
(1999) Curr. Top. Microbiol. Immunol. 240:59-80; Vasil and Vasil (1999)
Methods in Molecular Biology 111:349-358; Seki et al. (1999) Mol.
Biotechnol. 11:251-255). Particles may be coated with nucleic acids and
delivered into cells by a propelling force. Exemplary particles include those
containing tungsten, gold or platinum, as well as magnesium sulfate crystals.
The metal particles can penetrate through several layers of cells and thus
allow the transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering foreign nucleic acids into plant cells, e.g., maize
cells, by acceleration, a Biolistics Particle Delivery System may be used to
propel particles coated with DNA or cells through a screen, such as a
stainless steel or Nytex screen, onto a filter surface covered with plant (e.g.,
corn) cells cultured in suspension. The screen disperses the particles so that
they are not delivered to the recipient cells in large aggregates. The
intervening screen between the projectile apparatus and the cells to be
bombarded may reduce the size of projectile aggregates and may contribute
to a higher frequency of transformation by reducing damage inflicted on the
recipient cells by projectiles that are too large.



For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
plant target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
microprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
are important in this technology. Physical factors include those that involve
manipulating the DNA/microprojectile precipitate or those that affect the
flight and velocity of either the macro- or microprojectiles. Biological factors
include all steps involved in manipulation of cells before and immediately
after bombardment, the osmotic adjustment of target cells to help alleviate
the trauma associated with bombardment, and also the nature of the
transforming nucleic acid, such as linearized DNA, intact supercoiled
plasmids, or artificial chromosomes.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells. Ballistic particle
acceleration devices are available from Agracetus, Inc. (Madison, WI) and
BioRad (Hercules, CA).
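
As a rough illustration of how the adjustable physical parameters named above
might be organized for an optimization experiment, the following Python sketch
enumerates every combination of candidate settings and picks the condition
that gave the most stable transformants. All numeric values, units and
function names are hypothetical placeholders chosen for the example; none of
them are taken from this description.

```python
from itertools import product

# Hypothetical candidate settings for the four physical parameters.
# The numbers and units are placeholders, not recommended values.
PARAMETER_GRID = {
    "gap_distance_mm": [3, 6, 9],
    "flight_distance_mm": [5, 10],
    "tissue_distance_cm": [6, 9, 12],
    "helium_pressure_psi": [650, 1100, 1550],
}


def enumerate_conditions(grid):
    """Return one dict per combination of the listed parameter values."""
    names = list(grid)
    return [
        dict(zip(names, values))
        for values in product(*(grid[name] for name in names))
    ]


def best_condition(counts):
    """Return the condition index with the most stable transformants.

    `counts` maps a condition index to the number of stable
    transformants scored for that condition in the experiment.
    """
    return max(counts, key=counts.get)


conditions = enumerate_conditions(PARAMETER_GRID)
print(len(conditions))  # 3 * 2 * 3 * 3 = 54 combinations
```

In practice a full grid of this size would likely be pruned, and the biological
parameters (osmotic state, tissue hydration, subculture stage) could be added
as further keys in the same way.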

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. (1990) Plant Cell
2:603-618 and Fromm et al. (1990) Bio/Technology 8:833-839.
Transformation of rice may also be accomplished via particle bombardment
(see, e.g., Christou et al. (1991) Bio/Technology 9:957-962). Particle
bombardment may also be used to transform wheat (see, e.g., Vasil et al.
(1992) Bio/Technology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol.
102:1077-1084 for transformation of wheat using particle bombardment of
immature embryos and immature embryo-derived callus). The production of
transgenic barley using bombardment methods is described, for example, by
Koprek et al. (1996) Plant Sci. 119:79-91.

(3) Sonoporation

Foreign DNA, in particular artificial chromosomes, may be introduced
into plant protoplasts using ultrasound treatment, in particular mild
ultrasound treatment (10-100 kHz), to create pores for DNA uptake (see, e.g.,
International PCT application publication no. WO 91/00358) or may be
introduced into plant protoplasts via a sonoporation machine (ImaRx
Pharmaceutical Corp., Tucson, AZ).

Alternatively, the delivery of artificial chromosomes into plant host
cells is performed by any method described herein or well known in the art.
For example, needle-like whiskers (US 5,302,523, 1994, US 5,464,765)
have been used to deliver foreign DNA.

Suitable plant targets into which foreign DNA, in particular artificial
chromosomes, is transferred include, but are not limited to, protoplasts, cell
culture cells, cells in plant tissue, meristem cells, microspores, callus, pollen,
pollen tubes, egg-cells, embryo-sacs, zygotes or embryos in
different stages of development, seeds, seedlings, roots, stems, leaves,
whole plants, algae, or any plant part capable of proliferation and
regeneration of plants (see, e.g., U.S. Patent Nos. 5,990,390 and
6,037,526). The growth of the transformed plant targets described
herein can be done with tissue culture or non-tissue culture methods, with
the preferred methods being tissue culture methods.

All plant cells into which foreign DNA, in particular artificial
chromosomes, is introduced, and all plant material that is regenerated from
the transformed cells, are used directly for expressed purposes (e.g., herbicide
resistance, insect/pest resistance, disease resistance, environmental/stress
resistance, nutrient utilization, male sterility, improved nutritional content,
production of chemicals or biologicals, non-protein expressing sequences, and
preparation and screening of libraries) as described herein or are used to
produce transformed whole plants for the applications and uses described
herein. The particular protocol and means for the introduction of the artificial
chromosome into the plant host is adapted or refined to suit the particular
plant species or cultivar.

Chromosomes may be transferred to cells by microcell-mediated
chromosome transfer (MMCT) (Telenius et al., Chromosome Research 7:3-7,
1999; Ramulu et al., Methods in Molecular Biology 111:227-242, 1999). In
general, donor plant cultures or donor mammalian cell cultures are incubated
in media supplemented with reagents that inhibit DNA synthesis (e.g.,
hydroxyurea, aphidicolin) and/or reagents that inhibit attachment of
chromosomes to the mitotic spindle (e.g., colcemid, colchicine,
amiprophos-methyl, cremart). The cell walls of plant cells are digested with
enzymes (e.g., cellulase, macerozyme), producing protoplasts. Donor plant
protoplasts or donor mammalian cells are loaded on a Percoll gradient in the
presence of cytochalasin-B (which causes the cell cytoskeleton to
depolymerize into monomer protein subunits) and centrifuged at 10^5 x g.
During centrifugation the metaphase chromosomes are extruded through the
plasma membrane, forming plant 'microprotoplasts' or mammalian
'microcells.' The microprotoplasts/microcells are filtered through nylon
sieves of decreasing pore size (8-3 µm) to isolate smaller ones that contain
predominantly one metaphase chromosome. The microprotoplasts/microcells
are fused to recipient plant protoplasts or mammalian cells by polyethylene
glycol (PEG) treatment. The fusion mixture is cultured in appropriate media.
If the chromosome of interest is expressing a selection marker gene, the
fusion mixtures may be cultured in appropriate media supplemented with the
appropriate selection drug (e.g., hygromycin, kanamycin).
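
The size-selection step of the MMCT procedure above can be pictured with a
small toy model. The Python sketch below passes a list of particle diameters
through a series of sieves of decreasing pore size and keeps only the particles
that pass every sieve. It is a simplification for illustration only: the
function name, the example diameters and the intermediate 5 µm sieve are
assumptions, since the text gives only the 8 µm and 3 µm end points.

```python
def sieve_series(diameters_um, pore_sizes_um):
    """Pass particles through sieves of decreasing pore size.

    A particle passes a sieve only if its diameter is smaller than the
    pore size. Returns the diameters that pass every sieve.
    """
    passed = list(diameters_um)
    # Apply the largest pore first, then progressively smaller ones.
    for pore in sorted(pore_sizes_um, reverse=True):
        passed = [d for d in passed if d < pore]
    return passed


# Hypothetical microcell diameters (in µm) and a hypothetical sieve series.
example = sieve_series([1.5, 2.8, 4.0, 6.5, 9.0], [8, 5, 3])
print(example)  # [1.5, 2.8]
```

The model ignores clogging and deformation of microcells, so it only conveys
why the smallest fraction is enriched for particles carrying a single
chromosome.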

2. The growth of transformed plant host cells

In tissue culture methods, plant cells or protoplasts transformed by the
chemical, physical, or electrical methods described herein are grown, or
cultured, under selective conditions. The selective markers are integrated
into the heterologous DNA, in particular artificial chromosome, before its
introduction to plant hosts or are integrated into the plant host after
transfection. An additional marker can be used for double selection.
Generally, the plant cells or protoplasts are grown for numerous generations,
after which the transformed cells are identified.

The transformed cells are subjected to conditions known in the art for
callus initiation. Tissue that develops during the initiation period is placed in
a regeneration or selection medium where shoot and root development occur.
The plantlets are analyzed for the determination of transformation
(International PCT application publication no. WO 00/60061). In the case of
maize, embryonic callus cultures are initiated from immature maize embryos,
bombarded with genes, and transformed into plantlets by the methods
described in International PCT application publication no. WO 00/60061. In
tissue culture methods, rice calli are transformed with DNA encoding
insecticidal proteins CryIA(b) and CryIA(c) for insect resistance. Common
tissue culture methods can also be used to transform tobacco and tomato
(see, e.g., US Patent No. 6,136,320), embryogenic maize calli (US Pat. Nos.
5,508,468; 5,538,877; 5,538,880; 5,780,708; 6,013,863; 5,554,798;
5,990,390; and 5,484,956) and other crop species, e.g., potato and
tobacco (Sijmons et al. (1990) Bio/Technol. 8:217-221); tobacco
(Vandekerckhove et al. (1989) Bio/Technol. 7:929-932 and Owen and Pen
eds. Transgenic Plants: A Production System for Industrial and
Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996); and rice
(Zhu et al. (1994) Plant Cell Tiss. Org. Cult. 36:197-204).

3. Analysis of transformed plant host cells 

Once foreign DNA, in particular artificial chromosomes, is introduced
into plant hosts and the cells or protoplasts are grown and developed under
the conditions described herein, the plant cells or protoplasts which were
transformed with artificial chromosomes are identified. The plant cell,
protoplast, callus, leaf disc, or other plant target is screened for the
presence of artificial chromosomes by various methods well known in the art
including, but not limited to, assays for the expression of reporter genes,
PCR of the isolated plant chromosomes or DNA, electron microscopy,
visualization methods, and in situ hybridization of chromosome painting
probes as described herein. Moreover, cells treated with artificial
chromosomes are isolated during metaphase using a mitotic arrest agent,
such as colchicine, and the artificial chromosomes are distinguished from
endogenous chromosomes by fluorescence-activated cell sorting, size and
density differences, or by any method well known in the art. Alternatively,
when a selectable marker gene is transmitted with or as part of the artificial
chromosome, selective agents are used to detect the expression of the
selectable marker (International PCT application publication no. WO
00/60061; US Patent No. 6,136,320; Owen and Pen Eds. Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins). Enzymatic
assays, immunological assays, bioassays, germination assays, or chemical
assays are used to assess the phenotypic effects of artificial chromosomes
such as insect or fungal resistance or any other expression of genes in
artificial chromosomes (Cheng et al. (1998) 95:2767-2772; US Patent No.
6,126,320; International PCT application publication no. WO 00/60061;
Owen and Pen eds. Transgenic Plants: A Production System for Industrial
and Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996). The
plant cells, protoplasts, or other plant hosts that are successfully transformed
with artificial chromosomes are used directly to express the gene of interest
or are used to generate transgenic plants.

Fluorescent in situ hybridization (FISH) may be used to screen for the
transfer of artificial chromosomes into plant cells. DNA probes specific
for the artificial chromosome are used (e.g., a mouse major satellite DNA
probe for murine satellite DNA-based artificial chromosomes; or a kanamycin,
hygromycin or GUS gene DNA probe for a plant artificial chromosome
carrying such a gene). Standard FISH techniques for plant cells have been
described (de Jong et al., Trends in Plant Science 4:258-263, 1999).

IdU labeling can be used to determine the optimum conditions for
chromosome transfer (microcells) or isolated artificial chromosomes. The
incorporated IdU increases the fragility of the chromosome and will increase
the probability of cellular mutation. Hence, the cells are fixed within 48
hours after transfection/fusion and analyzed for chromosome uptake using
various procedures. Once the optimum transfer conditions have been
determined, long-term expression experiments are performed with unlabeled
artificial chromosomes or microcells.

H. Re-generation of transgenic plants 

Plants containing artificial chromosomes are generated from plant
cells, protoplasts, calli, or other plant tissue targets into which foreign DNA,
in particular artificial chromosomes, has been introduced. Regeneration
techniques for many commercially important plant species are well known in
the art. The artificial chromosomes that are inserted into plant hosts to
produce transgenic plants are PACs or MACs.

Plants are re-generated by the planting of transformed roots, plantlets,
seeds, seedlings and structures capable of growing into a whole plant
capable of reproduction (see, e.g., US Patent No. 6,136,320 and
International PCT application No. WO 00/60061). The re-generation of maize
plants from transformed protoplasts is found, for example, in European
Patent Application nos. 0 292 435 and 0 392 225 and International PCT
Application Publication no. WO 93/07278; the regeneration of rice following
gene transfer is found in Zhang et al. (1988) Plant Cell Rep. 7:379-384;
Shimamoto et al. (1989) Nature 338:274-277; Datta et al. (1990)
Bio/Technology 8:736-740; and the re-generation of fertile transgenic barley
by direct DNA transfer to protoplasts is described by Funatsuki et al. (1995)
Theor. Appl. Genet. 91:707-712. Alternatively, plants containing artificial
chromosomes are obtained by crossing a plant containing an artificial
chromosome with another plant to produce plants having an artificial
chromosome in their genomes (see, e.g., US Patent No. 6,150,585).

Plants containing an artificial chromosome are propagated through
seed, cuttings, or vegetatively. The seed from plants containing an artificial
chromosome is grown in the field, in pots, indoors, outdoors, in
greenhouses, on glass, or in or on any suitable medium, and the resulting
sexually mature transgenic plants are self-pollinated to generate true breeding
plants. The progeny from these transgenic plants become true breeding lines
(International PCT application publication Nos. WO 00/60061 and EP
1017268; US Patent Nos. 5,631,152; 5,955,362; 6,015,940; 6,013,523;
6,096,546; 6,037,527; 6,153,812; Weissbach and Weissbach (1988)
Methods for Plant Molecular Biology, Academic Press, Inc.; Fromm et al.
(1990) Bio/Technology 8:833-839; Gordon-Kamm et al. (1990) Plant Cell
2:603-618; Koziel et al. (1993) Bio/Technology 11:194-200; and Golovkin et
al. (1993) Plant Sci. 90:41-52).

1. PACs

Plant artificial chromosomes (PACs) are prepared by the in vivo and in
vitro methods described herein. PACs may be prepared inside plant
protoplasts and then transferred to plant targets, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 72-74). PACs are isolated from the protoplasts in which they were
prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs are isolated and delivered directly to plant protoplasts, plant cells,
or other plant targets via a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, sonoporation, or any
method known in the art as described herein (Hain et al. (1985) Mol. Gen.
Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm et
al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358).

2. MACs

Mammalian artificial chromosomes (MACs) are prepared by the in vivo
and in vitro methods described in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application No. WO 97/40183. MACs are prepared as
microcells, and the microcells are fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
are isolated and delivered directly to plant cells, protoplasts, and other plant
targets via a PEG-mediated process, calcium phosphate-mediated process,
electroporation, microinjection, sonoporation, or any method known in the
art as described herein and in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets are developed using standard conditions into roots, shoots, plantlets,
or any structure capable of growing into a plant. Transgenic plants can, in
turn, be generated by the planting of transformed roots, plantlets, seeds,
seedlings and structures capable of growing into a plant. Transgenic
plants can be propagated, for example, through seed, cuttings, or vegetative
propagation.

I. Applications and Uses of Artificial Chromosomes 

Artificial chromosomes provide convenient and useful vectors, and in
some instances (e.g., in the case of very large heterologous genes) the only
vectors, for introduction of heterologous genes into hosts. Virtually any
gene of interest is amenable to introduction into a host via artificial
chromosomes.

As described herein, there are numerous methods for using artificial
chromosomes to introduce coding sequences into plant cells. These include
methods for using artificial chromosomes to express genes encoding
commercially valuable enzymes and therapeutic compounds in plant cells,
introduction of agronomically important traits, or applications related to the
manipulation of large regions of DNA.

The artificial chromosomes provided herein may be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry. They are also intended for use in methods of gene
therapy and for production of transgenic organisms, particularly plants
(discussed above, below and in the EXAMPLES).

1. Production of products in plants

Methods for expression of heterologous proteins in plant cells
("molecular farming") are provided. At present, many foreign proteins have
been expressed in whole plants or selected plant organs. Plants can offer a
highly effective and economical means to produce recombinant proteins as
they can be grown on a large scale at modest cost. The production of
heterologous proteins in plants has included genes that are fused to strong
constitutive plant promoters, e.g., 35S from cauliflower mosaic virus
(Sijmons et al., 1990, Bio/Technology, 8:217-221, Benfey and Chua, US
5,110,732, Fraley et al., US 5,858,742, McPherson and Kay, US
5,359,142); seed-specific promoters (Hall et al., US 5,504,200, Knauf et al.,
US 5,530,194, Thomas et al., US 5,905,186, Moloney, US 5,792,922, US
5,948,682); or promoters active in other plant organs such as fruit (Radke et
al., 1988, Theoret. Appl. Genet., 75:685-694, Bestwick et al., US
5,783,394, Houck and Pear, US 4,943,674) or storage organs such as
tubers (Rocha-Sosa et al., US 5,436,393, US 5,723,757). The genes under
the control of these promoters can encode any protein and include, for example,
genes that encode receptors, cytokines, enzymes, proteases, hormones,
growth factors, antibodies, tumor suppressor genes, vaccines, therapeutic
products and multigene pathways.

For example, industrial enzymes that can be produced include
α-amylase, glucanase, phytase and xylanase (see Goddijn and Pen
(1995) Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology
10:292-293; Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-
1919; and, e.g., Herbers and Sonnewald (1996) in Transgenic Plants: A
Production System for Industrial and Pharmaceutical Proteins, Owen and Pen
Eds., John Wiley & Sons, West Sussex, England), proteases such as
subtilisin and other industrially important enzymes. Additional proteins that
can be produced in crops by molecular farming include other industrial
enzymes, for example, proteases, carbohydrate modifying enzymes such as
glucose oxidase, cellulases, hemicellulases, xylanases, mannanases or
pectinases (e.g., Baszczynski et al., US 5,824,870, US 5,767,379, Bruce et
al., US 5,804,694). Additionally, enzymes particularly valuable in the pulp
and paper industry, such as ligninases or xylanases, also
can be expressed (Austin-Phillips et al., US 5,981,835). Other examples of
enzymes include phosphatases, oxidoreductases and phytases (van Ooijen
et al., US 5,714,474).

Additionally, expression and delivery of vaccines in plants has been
proposed (Arntzen and Lam, US 6,136,320, US 5,914,123, Curtiss and
Cardineau, US 5,679,880, US 5,654,184, Lam and Arntzen,
US 5,612,487, US 6,034,298, Rymerson et al., WO9937784A1), as well as
antibodies (Conrad et al., WO 972900A1, Hein et al., US 5,959,177, Hiatt
and Hein, US 5,202,422, US 5,639,947, Hiatt et al., US 6,046,037),
peptide hormones (Vandekerckhove, J.S., US 5,487,991, Brandle et al.,
WO9967401A2), blood factors and similar therapeutic molecules.
Expression of vaccines in edible plants can provide a means for drug delivery
which is cost effective and particularly suited for the administration of
therapeutic agents in rural or underdeveloped countries. The plant material
containing the therapeutic agents could be cultivated and incorporated into
the diet (Lam, D.M., and Arntzen, C.J., US 5,484,719). Similarly, plants
used for animal feed can be engineered to express veterinary biologics that
can provide protection against animal disease (Rymerson et al.,
WO9937784A1). Antibodies also can be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science 268:716-
719). Monoclonal antibodies for therapeutic and diagnostic applications are
of particular interest.

Examples of human biopharmaceuticals that can be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Cells containing the artificial chromosomes provided herein can
advantageously be used in in vitro plant cell-based systems for production of
proteins, particularly several proteins from one cell line, such as multiple
proteins involved in a biochemical pathway or multivalent vaccines. The
genes encoding the proteins are introduced into the artificial chromosomes
which are then introduced into plant cells. Plant cells useful for this purpose
are those that grow well in culture, or most preferably, plant cells capable of
being regenerated to whole plants. Plants can then be cultivated by common
methods to produce plant material comprising said heterologous proteins.
The heterologous proteins can be subject to purification or the plant tissue or
extracts thereof can be used directly for vaccination, amelioration of disease,
or processing of material, such as bleaching during pulp and paper
processing or enzymatic conversion of industrial materials or feedstocks.




Alternatively, the heterologous gene(s) of interest are transferred into a
production cell line or plant line that already contains artificial chromosomes
in a manner that targets the gene(s) to the artificial chromosomes. The cells
or plants are grown under conditions whereby the heterologous proteins are
expressed. Because the proteins are expressed at high levels in a stable
permanent extra-genomic chromosomal system, selective conditions are not
required.

Selection of host lines for use in artificial chromosome-based protein
production systems is within the skill of the art, but often will depend on a
variety of factors, including the properties of the heterologous protein to be
produced, potential toxicity of the protein in the host cell, any requirements
for post-translational modification (e.g., glycosylation, amination,
phosphorylation) of the protein, transcription factors available in the cells,
the type of promoter element(s) being used to drive expression of the
heterologous gene, whether production is completely intracellular or the
heterologous protein will preferably be secreted from the cell, or be
sequestered or localized, and the types of processing enzymes in the cell.

Artificial chromosomes can be engineered as platforms for the
production of specific molecules in plant cells. For example, production of
complex mammalian molecules, such as multichain antibodies, requires a
number of protein activities not normally found in plant species. It is
possible to produce an artificial chromosome that comprises all of the
mammalian activities needed to produce human antibodies, correctly modified
and processed, by introducing into an artificial chromosome the genes
needed to carry out these activities. Said genes would be modified, for
example, by placing each gene under the control of a plant promoter, or by
placing the master control gene, i.e., a gene that controls expression of the
various genes, under the control of a plant promoter. Alternatively,
mammalian transcriptional control factors could be introduced, under the
control of plant active promoters, to be expressed in a plant cell and cause
the expression of said target proteins, for example multichain antibodies.

In this fashion, plant artificial chromosomes are developed, each
capable of supporting the efficient production of a specific class of valuable
products, for example, antibodies, blood clotting factors, etc. Thus,
production of products within a class, for example, human antibodies, would
simply involve the introduction of a specific antibody coding sequence,
without modification, into the artificial chromosome engineered specifically for
the production of human antibodies. The artificial chromosome would
comprise all of the required genetic activities for the proper expression,
translation and post-translational modification of human antibodies. Such
artificial chromosomes can be used in a variety of applications, such as, but
not limited to, large scale production of numerous specific human
antibodies.

Advantages of plant cells as host cell lines in the production of
recombinant proteins include, but are not limited to, the following: (1)
proteins are post-translationally modified similar to mammalian systems, (2)
plants can be directed to secrete proteins into stable, dry, intracellular
compartments of seeds called endosperm protein bodies, which can easily be
collected, (3) the amount of recombinant product that can be produced
approaches industrial scale levels and (4) health risks due to contamination
with potential pathogens/toxins are minimized.

The artificial chromosome-based system for heterologous protein
production has many advantageous features. For example, as described
above, because the heterologous DNA is located in an independent,
extra-genomic artificial chromosome (as opposed to randomly inserted in an
unknown area of the host cell genome or located as extrachromosomal
element(s) providing only transient expression), it is stably maintained in an
active transcription unit and is not subject to ejection via recombination or
elimination during cell division. Accordingly, it is unnecessary to include a
selection gene in the host cells and thus growth under selective conditions is
also unnecessary. Furthermore, because the artificial chromosomes are
capable of incorporating large segments of DNA, multiple copies of the
heterologous gene and linked promoter element(s) can be retained in these
chromosomes, thereby providing for high-level expression of the foreign
protein(s). Alternatively, multiple copies of the gene can be linked to a single
promoter element and several different genes can be linked in a fused
polygene complex to a single promoter for expression of, for example, all the
key proteins constituting a complete metabolic pathway (see, e.g., Beck von
Bodman et al. (1995) Bio/Technology 13:587-591). Alternatively, multiple
copies of a single gene can be operatively linked to a single promoter, or
each or one or several copies can be linked to different promoters or multiple
copies of the same promoter. Additionally, because artificial chromosomes
have an almost unlimited capacity for integration and expression of foreign
genes, they can be used not only for the expression of genes encoding
end-products of interest, but also for the expression of genes associated with
optimal maintenance and metabolic management of the host cell, e.g., genes
encoding growth factors, as well as genes that facilitate rapid synthesis of
the correct form of the desired heterologous protein product, e.g., genes
encoding processing enzymes and transcription factors as described above.

The artificial chromosomes are suitable for expression of any proteins 
or peptides, including proteins and peptides that require in vivo 
posttranslational modification for their biological activity. Such proteins 
include, but are not limited to, antibody fragments, full-length antibodies, and
multimeric antibodies, tumor suppressor proteins, naturally occurring or
artificial antibodies and enzymes, heat shock proteins, and others. 

Thus, such cell-based "protein factories" employing artificial
chromosomes can be generated using artificial chromosomes constructed
with multiple copies (theoretically an unlimited number or at least up to a
number such that the resulting artificial chromosome is about up to the size
of a genomic chromosome (i.e., endogenous)) of protein-encoding genes with
appropriate promoters, or multiple genes driven by a single promoter, i.e., a
fused gene complex (such as a complete metabolic pathway in a plant
expression system; see, e.g., Beck von Bodman et al. (1995) Biotechnology
13:587-591). Once such an artificial chromosome is constructed, it can be
transferred to a suitable plant species capable of being propagated under
field conditions, or under conditions that permit the recovery of the intended
product. Plant cell cultures such as algae can be used in a system analogous
to mammalian cell culture systems. The advantages of plant-based systems
such as this include low input costs for growth, rapid growth rates and the
ability to produce a large biomass economically.

The ability of artificial chromosomes to provide for high-level 
expression of heterologous proteins in host cells is demonstrated, for 
example, by analysis of mammalian cells containing a mammalian artificial
chromosome, H1D3 and G3D5 cell lines described herein. Northern blot
analysis of mRNA obtained from these cells reveals that expression of the
hygromycin-resistance and β-galactosidase genes in the cells correlates with
the amplicon number of the megachromosome(s) contained therein.

Transgenic plants producing these compounds are made by the
introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals and enzymes that could produce compounds of interest to
the manufacturing industry such as specialty chemicals and plastics. The
compounds are produced by the plant, extracted upon harvest
and/or processing, and used for any presently recognized useful purpose
such as pharmaceuticals, fragrances, and industrial enzymes. Alternatively,
plants produced in accordance with the methods and compositions provided 
herein can be made to metabolize certain compounds, such as hazardous 
wastes, thereby allowing bioremediation of these compounds.

The artificial chromosomes provided herein can be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry.

2. Genetic alteration of organisms to possess desired traits 
Artificial chromosomes are ideally suited for preparing organisms, such 
as plants, that possess certain desired traits, such as, for example, disease 
resistance, resistance to harsh environmental conditions, altered growth 

patterns and enhanced physical characteristics. With respect to plants, the
choice of the particular nucleic acid that will be delivered to recipient cells via
artificial chromosomes often will depend on the purpose of the
transformation. One of the major purposes of transformation of crop and
tree species is to add some commercially desirable, agronomically important
traits to the plant. Such traits include, but are not limited to, input and
output traits such as herbicide resistance or tolerance, insect resistance or
tolerance, disease resistance or tolerance (viral, bacterial, fungal or
nematode), stress tolerance and/or resistance, as exemplified by resistance
or tolerance to drought, heat, chilling, freezing, excessive moisture, salt
stress and oxidative stress, increased yields, food content and makeup,
physical appearance, male sterility, drydown, standability, prolificacy, starch
quantity and quality, oil quantity and quality, protein quantity and quality and 
amino acid composition. It may be desirable to incorporate one or more 
genes conferring such desirable traits into host plants.

a. Herbicide resistance

The genes encoding phosphinothricin acetyltransferase (bar and pat), 
glyphosate tolerant EPSP synthase genes, the glyphosate degradative 
enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a 
dehalogenase enzyme that inactivates dalapon), herbicide resistant
(e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes
(encoding a nitrilase enzyme that degrades bromoxynil) are all examples of
herbicide resistance genes for use in plant transformation. The bar and pat
genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which
inactivates the herbicide phosphinothricin and prevents this compound from
inhibiting glutamine synthetase enzymes. The enzyme
5-enolpyruvylshikimate 3-phosphate synthase (EPSP synthase) is normally
inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate).
However, genes are known that encode glyphosate-resistant EPSP synthase
enzymes. The deh gene encodes the enzyme dalapon dehalogenase and
confers resistance to the herbicide dalapon. The bxn gene codes for a 
specific nitrilase enzyme that converts bromoxynil to a non-herbicidal 
degradation product. 

b. Insect and other pest resistance 

Insect-resistant organisms may be prepared in which resistance or
decreased susceptibility to insect-induced disease is conferred by
introduction into the host organism or embryo of artificial chromosomes
containing DNA encoding gene products (e.g., ribozymes and proteins that
are toxic to certain pathogens) that destroy or attenuate pathogens or limit
access of pathogens to the host. Potential insect resistance genes that can
be introduced into plants via artificial chromosomes include Bacillus
thuringiensis crystal toxin genes or Bt genes (see, e.g., Watrud et al. (1985)
in Engineered Organisms and the Environment). Bt genes may provide
resistance to lepidopteran or coleopteran pests such as the European Corn
Borer (ECB). Such Bt toxin genes include the CryIA(b) and CryIA(c) genes.
Endotoxin genes from other species of B. thuringiensis which affect insect
growth or development also may be employed in this regard. Bt gene
sequences can be modified to effect increased expression in plants, and
particularly monocot plants. Means for preparing synthetic genes are well
known in the art and are disclosed in, for example, U.S. Patent Nos.
5,500,365 and 5,689,052. Examples of such modified Bt toxin genes
include a synthetic Bt CryIA(b) gene (see, e.g., Perlak et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:3324-3328) and the synthetic CryIA(c) gene
termed 1800b (see PCT Application publication no. WO95/06128).

Examples of the types of genes that may be transferred into plants via
artificial chromosomes to generate disease- and/or insect-resistant transgenic
plants include, but are not limited to, the cryIA(b) and cryIA(c) genes which
yield products that are highly toxic to two major rice insect pests (the striped
stem borer and the yellow stem borer) (see, e.g., Cheng et al. (1998) Proc.
Natl. Acad. Sci. U.S.A. 95:2767-2772), cry3 genes which encode products
that are toxic to Coleopteran insects that attack a variety of plants, including
grains and legumes (see, e.g., U.S. Patent No. 6,023,013), genes (e.g., DNA
encoding trichothecene 3-O-acetyltransferase) that confer resistance to
trichothecenes such as those produced by plant fungi (e.g., Fusarium) in
plants particularly susceptible to fungi (e.g., wheat, rye, barley, oats, and
maize) (see, e.g., PCT Application publication no. WO 00/60061), and genes
involved in multi-gene biosynthetic pathways that yield antipathogenic 
substances that have a deleterious effect on the growth of plant pathogens 
(see, e.g., U.S. Patent No. 5,639,949). 

Protease inhibitors may also provide insect resistance (see, e.g.,
Johnson et al. (1989)) and will thus have utility in plant transformation. The
use of a protease inhibitor II gene, pinII, from tomato or potato may be
particularly useful. The combined effect of the use of a pinII gene with a Bt
toxin gene can produce synergistic insecticidal activity. Other genes that
encode inhibitors of the insect's digestive system, or those that encode
enzymes or co-factors that facilitate the production of inhibitors, also may be
useful. This group may be exemplified by oryzacystatin and amylase 
inhibitors such as those from wheat and barley. 

Genes encoding lectins may confer additional or alternative insecticidal
properties. Lectins (originally termed phytohemagglutinins) are multivalent
carbohydrate-binding proteins which have the ability to agglutinate red blood
cells from a range of species. Lectins have been identified as insecticidal
agents with activity against weevils, ECB and rootworm (see, e.g., Murdock
et al. (1990) Phytochemistry 29:85-89; Czapla & Lang (1990) J. Econ.
Entomol. 83:2480-2485). Lectin genes that may be useful include, for
example, barley and wheat germ agglutinin (WGA) and rice lectins
(Gatehouse et al. (1984) J. Sci. Food Agric. 35:373-380).

Genes controlling the production of large and small polypeptides active
against insects when introduced into the insect pests, such as, for example,
lytic peptides, peptide hormones and toxins and venoms, may also be useful
in generating pest-resistant plants. For example, expression of juvenile
hormone esterase, directed toward specific insect pests, also may result in
insecticidal activity, or cause cessation of metamorphosis (see, e.g.,
Hammock et al. (1990) Nature 344:458-461).

Genes which encode enzymes that affect the integrity of the insect
cuticle are additional examples of genes that may
be transferred to plants via artificial chromosomes to confer resistance to
insects. Such genes include those encoding, for example, chitinase,
proteases, lipases and also genes for the production of nikkomycin, a
compound that inhibits chitin synthesis, the introduction of any of which
may be used to produce insect-resistant plants. Genes that affect insect
molting, such as those affecting the production of ecdysteroid UDP-glucosyl
transferase, also can be useful transgenes.

Genes that code for enzymes that facilitate the production of
compounds that reduce the nutritional quality of the host plant to insect
pests may also be used to confer insect resistance on plants. It may be
possible, for instance, to confer insecticidal activity on a plant by altering its
sterol composition. Sterols are obtained by insects from their diet and are
used for hormone synthesis and membrane stability. Therefore, alterations in
plant sterol composition by expression of genes that directly promote the
production of undesirable sterols or those that convert desirable sterols into
undesirable forms, could have a negative effect on insect growth and/or
development and hence endow the plant with insecticidal activity.
Lipoxygenases are naturally occurring plant enzymes that have been shown
to exhibit anti-nutritional effects on insects and to reduce the nutritional
quality of their diet. Therefore, transgenic plants with enhanced
lipoxygenase activity may be resistant to insect feeding. 

Tripsacum dactyloides is a species of grass that is resistant to certain
insects, including corn rootworm. Tripsacum may thus include genes
encoding proteins that are toxic to insects or are involved in the biosynthesis
of compounds toxic to insects. Such genes may be useful in conferring
resistance to insects. It is known that the basis of insect resistance in
Tripsacum is genetic, because said resistance has been transferred to Zea
mays via sexual crosses (Branson and Guss, 1972). It is further anticipated
that other cereal, monocot or dicot plant species may have genes encoding
proteins that are toxic to insects which would be useful for producing insect
resistant plants.

Further genes encoding proteins characterized as having potential
insecticidal activity also may be used as transgenes in accordance herewith.
Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder
et al., 1987) which may be used as a rootworm deterrent, genes encoding
avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda
et al., 1987) which may prove particularly useful as a corn rootworm
deterrent, ribosome inactivating protein genes and even genes that regulate
plant structures. Transgenic plants including anti-insect antibody genes and
genes that code for enzymes that can convert a non-toxic insecticide (pro-
insecticide) applied to the outside of the plant into an insecticide inside the 
plant also are contemplated. 

c. Disease resistance 
Transgenic organisms, such as plants, that express genes that confer
resistance or reduce susceptibility to disease are of particular interest. For
example, the transgene may encode a protein that is toxic to a pathogen, 
such as a virus, fungus, mycotoxin-producing organism, nematode or 
bacterium, but that is not toxic to the transgenic host. 

Because multiple genes can be introduced on an artificial
chromosome, a series of genes encoding a genetic pathway involved in
disease resistance or tolerance can be introduced into crop plants. For
example, it is known that often numerous genes are expressed upon
pathogen invasion; typically, one or more "PR", or pathogenesis related,
proteins are expressed in response to invasion of a plant by a bacterial or
fungal pathogen.
One or more of the proteins involved in conferring resistance to pathogens
can be contained within an artificial chromosome and therefore be expressed
in a plant cell, in particular a whole transgenic plant as described herein. In
addition, production of single-chain Fv recombinant antibodies in plants may
extend the range of possibilities for the introduction of pathogen protection
in crop plants (see, e.g., Tavladoraki et al. (1993) Nature 366:469-472).

It has been demonstrated that expression of a viral coat protein in a 
transgenic plant can impart resistance to infection of the plant by that virus 
and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et
al., 1988; Abel et al., 1986). Expression of antisense genes targeted at
essential viral functions may also impart resistance to viruses. For example,
an antisense gene targeted at the gene responsible for replication of viral
nucleic acid may inhibit replication and lead to resistance to the virus.
Interference with other viral functions through the use of antisense genes
also may increase resistance to viruses. Further, it may be possible to
achieve resistance to viruses through other approaches, including, but not
limited to, the use of satellite viruses. Artificial chromosomes are ideally
suited for carrying a multiplicity of these genes and DNA sequences which 
are useful for conferring a broad range of resistance to many pathogens.

Genes encoding so-called "peptide antibiotics," pathogenesis related
(PR) proteins, toxin resistance, and proteins affecting host-pathogen
interactions such as morphological characteristics may also be useful,
particularly in
conferring increased resistance to diseases caused by bacteria and fungi.
Peptide antibiotics are polypeptide sequences which are inhibitory to growth
of bacteria and other microorganisms. For example, the classes of peptides
referred to as cecropins and magainins inhibit growth of many species of
bacteria and fungi. Expression of PR proteins in monocotyledonous plants
such as maize may be useful in conferring resistance to bacterial disease.
These genes are induced following pathogen attack on a host plant and have
been divided into at least five classes of proteins (Bol, Linthorst, and
Cornelissen, 1990). Included among the PR proteins are β-1,3-glucanases,
chitinases, and osmotin and other proteins that are believed to function in
plant resistance to disease organisms. Other genes have been identified that
have antifungal properties, e.g., UDA (stinging nettle lectin) and hevein
(Broekaert et al., 1989; Barkai-Golan et al., 1978). It is known that certain
plant diseases are caused by the production of phytotoxins. Resistance to
these diseases may be achieved through expression of a gene that encodes
an enzyme capable of degrading or otherwise inactivating the phytotoxin. It
also is contemplated that expression of genes that alter the interactions
between the host plant and pathogen may be useful in reducing the ability of
the disease organism to invade the tissues of the host plant, e.g., an
increase in the waxiness of the leaf cuticle or other morphological
characteristics. 

d. Environment or stress resistance

Improvement of a plant's ability to tolerate various environmental
stresses such as, but not limited to, drought, excess moisture, chilling,
freezing, high temperature, salt, and oxidative stress, also can be effected
through expression of genes therein. It is proposed that benefits may be
realized in terms of increased resistance to freezing temperatures through the
introduction of an "antifreeze" protein such as that of the Winter Flounder
(Cutler et al., 1989) or synthetic gene derivatives thereof. Improved chilling
tolerance also may be conferred through increased expression of
glycerol-3-phosphate acyltransferase in chloroplasts (Wolter et al., 1992).
Resistance
to oxidative stress in some crop species (often exacerbated by conditions
such as chilling temperatures in combination with high light intensities) can
be conferred by expression of superoxide dismutase (Gupta et al., 1993),
and may be improved by glutathione reductase (Bowler et al., 1992). Such
strategies may allow for tolerance to freezing in newly emerged fields as well
as extending later-maturity, higher-yielding varieties to earlier relative
maturity zones.

It is contemplated that the expression of genes that favorably affect
plant water content, total water potential, osmotic potential, and turgor will
enhance the ability of the plant to tolerate drought. As used herein, the
terms "drought resistance" and "drought tolerance" are used to refer to a
plant's increased resistance or tolerance to stress induced by a reduction in
water availability, as compared to normal circumstances, and the ability of
the plant to function and survive in lower-water environments. The
expression of genes encoding for the biosynthesis of osmotically-active
solutes, such as polyol compounds, may impart protection against drought.
Within this class are genes encoding for mannitol-1-phosphate
dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase
(Kaasen et al., 1992). Through the subsequent action of native
phosphatases in the cell or by the introduction and coexpression of a specific
phosphatase, these introduced genes will result in the accumulation of either
mannitol or trehalose, respectively, both of which have been well
documented as protective compounds able to mitigate the effects of stress.
Mannitol accumulation in transgenic tobacco has been verified and
preliminary results indicate that plants expressing high levels of this
metabolite are able to tolerate an applied osmotic stress (Tarczynski et al.,
1992, 1993).

Similarly, the efficacy of other metabolites in protecting either enzyme
function (e.g., alanopine or propionic acid) or membrane integrity (e.g.,
alanopine) has been documented (Loomis et al., 1989), and therefore
expression of genes encoding for the biosynthesis of these compounds might
confer drought resistance in a manner similar to or complementary to
mannitol. Other examples of naturally occurring metabolites that are
osmotically active and/or provide some direct protective effect during
drought and/or desiccation include fructose, erythritol (Coxson et al., 1992),
sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984;
Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988;
Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline
(Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and
Bohnert, 1992). Continued canopy growth and increased reproductive
fitness during times of stress will be augmented by introduction and
expression of genes such as those controlling the osmotically active
compounds discussed above and other such compounds. Genes which
promote the synthesis of an osmotically active polyol compound include
genes which encode the enzymes mannitol-1-phosphate dehydrogenase,
trehalose-6-phosphate synthase and myoinositol O-methyltransferase.
Artificial chromosomes can carry a multiplicity of genes to provide durable
stress tolerance, for example, concomitant expression of proline and betaine
and/or polyols.

It is contemplated that the expression of specific proteins also may
increase drought tolerance under certain conditions or in certain crop
species. These may include proteins such as Late Embryogenesis Abundant
(LEA) proteins (see Dure et al., 1989). All three classes of LEAs have been
demonstrated in maturing (i.e., desiccating) seeds. Within LEA proteins, the
Type-II (dehydrin-type) have generally been implicated in drought and/or
desiccation tolerance
in vegetative plant parts (i.e., Mundy and Chua, 1988; Piatkowski et al.,
1990; Yamaguchi-Shinozaki et al., 1992). Recently, expression of a Type-III
LEA (HVA-1) in tobacco was found to influence plant height, maturity and
drought tolerance (Fitzpatrick, 1993). In rice, expression of the HVA-1 gene
influenced tolerance to water deficit and salinity (Xu et al., 1996).
Expression of structural genes from all three LEA groups may therefore
confer drought tolerance. Other types of proteins induced during water
stress include thiol proteases, aldolases and transmembrane transporters
(Guerrero et al., 1990), which may confer various protective and/or
repair-type functions during drought stress. It also is contemplated that genes
that affect lipid biosynthesis and hence membrane composition might also be
useful in conferring drought resistance on the plant.

Many of these genes for improving drought resistance have 
complementary modes of action. Thus, combinations of these genes might 
have additive and/or synergistic effects in improving drought resistance in
plants. Many of these genes also improve freezing tolerance (or resistance);
the physical stresses incurred during freezing and drought are similar in
nature and may be mitigated in similar fashion. Benefit may be conferred via
constitutive expression of these genes, but the preferred means of
expressing these genes may be through the use of a turgor-induced promoter
(such as the promoters for the turgor-induced genes described in Guerrero et
al., 1990 and Shagan et al., 1993 which are incorporated herein by
reference). Spatial and temporal expression patterns of these genes may 
enable plants to better withstand stress. 

It is proposed that expression of genes that are involved with specific
morphological traits that allow for increased water extractions from drying
soil would be of benefit. For example, introduction and expression of genes
that alter root characteristics may enhance water uptake. It also is
contemplated that expression of genes that enhance reproductive fitness
during times of stress would be of significant value. For example, expression
of genes that improve the synchrony of pollen shed and receptiveness of the
female flower parts, i.e., silks, would be of benefit. In addition it is
proposed that expression of genes that minimize kernel abortion during times 
of stress would increase the amount of grain to be harvested and hence be 
of value. 

Given the overall role of water in determining yield, it is contemplated
that enabling plants to utilize water more efficiently, through the introduction
and expression of genes, will improve overall performance even when soil
water availability is not limiting. By introducing genes that improve the
ability of plants to maximize water usage across a full range of stresses
relating to water availability, yield stability or consistency of yield
performance may be realized. 

e. Plant agronomic characteristics 
Plants possessing desired traits that might, for example, enhance 
utility, processability and commercial value of the organisms in areas such as
the agricultural and ornamental plant industries may also be generated using
artificial chromosomes in the same manner as described above for production 
of disease-resistant organisms. In such instances, the artificial chromosomes 
that are introduced into the organism or embryo contain DNA encoding gene 
products that serve to confer the desired trait in the organism. 

For example, transgenic plants having improved flavor properties,
stability and/or quality are of commercial interest. One possible method for
generating such plants may include the expression of transgenes, e.g., genes
encoding cystathionine gamma synthase (CGS), that result in increased free 
methionine levels (see, e.g., PCT Application publication no. WO 00/55303). 



Two of the factors determining where crop plants can be grown are 
the average daily temperature during the growing season and the length of 
time between frosts. Within the areas where it is possible to grow a 
particular crop, there are varying limitations on the maximal time it is allowed
to grow to maturity and be harvested. For example, a variety to be grown in
a particular area is selected for its ability to mature and dry down to
harvestable moisture content within the required period of time with
maximum possible yield. Therefore, crops of varying maturities are
developed for different growing locations. Apart from the need to dry down
sufficiently to permit harvest, it is desirable to have maximal drying take
place in the field to minimize the amount of energy required for additional
drying post-harvest. Also, the more readily a product such as grain can dry
down, the more time there is available for growth and kernel fill. Genes that
influence maturity and/or dry down can be identified and introduced into
plant lines using transformation techniques to create new varieties adapted
to different growing locations or the same growing location, but having 
improved yield to moisture ratio at harvest. Expression of genes that are 
involved in regulation of plant development may be especially useful.

Genes that would improve standability and other plant growth
characteristics may also be introduced into plants. Expression of new genes
in plants which confer stronger stalks, improved root systems, or prevent or
reduce ear droppage would be of great value to the farmer. Introduction and
expression of genes that increase the total amount of photoassimilate
available by, for example, increasing light distribution and/or interception
would be advantageous. In addition, the expression of genes that increase
the efficiency of photosynthesis and/or the leaf canopy would further
increase gains in productivity. Expression of a phytochrome gene in crop
plants may be advantageous. Expression of such a gene may reduce
apical dominance, confer semidwarfism on a plant, and increase shade
tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for
increased plant populations in the field. 

f. Nutrient utilization 

The ability to utilize available nutrients may be a limiting factor in
growth of crop plants. It may be possible to alter nutrient uptake, tolerance
of pH extremes, mobilization through the plant, storage pools, and availability
for metabolic activities by the introduction of new agents. These
modifications would allow a plant such as maize to more efficiently utilize
available nutrients. An increase in the activity of, for example, an enzyme
that is normally present in the plant and involved in nutrient utilization may
increase the availability of a nutrient. An example of such an enzyme would
be phytase. It is further contemplated that enhanced nitrogen utilization by a
plant is desirable. Expression of a glutamate dehydrogenase gene in plants,
e.g., E. coli gdhA genes, may lead to enhanced resistance to the herbicide
glufosinate by incorporation of excess ammonia into glutamate, thereby
detoxifying the ammonia. Gene expression may make a nutrient source
available that was previously not accessible, e.g., an enzyme that releases a
component of nutrient value from a more complex molecule, perhaps a
macromolecule. Alternatively, artificial chromosomes can carry the
multiplicity of genes governing nodulation and nitrogen fixation in legumes.
The artificial chromosomes could be used to promote nodulation in non- 
legume species. 

g. Male sterility 

Male sterility is useful in the production of hybrid seed. Male sterility 
25 may be produced through gene expression. For example, it has been shown 
that expression of genes that encode proteins that interfere with 
development of the male inflorescence and/or gametophyte result in male 
sterility. Chimeric ribonuclease genes that express in the anthers of 
transgenic tobacco and oilseed rape have been demonstrated to lead to male 
30 sterility (Mariani et aL, 1990). Other methods of conferring male sterility 
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have been described, including gene encoding antisense RNA capable of 
causing male sterility (U.S. Patent Nos. 6,184,439, 6,191,343 and 
5,728,926) and methods utilizing two genes to confer sterility, see, e.g., 
U.S. Patent No. 5,426,041. 

A number of mutations were discovered in maize that confer
cytoplasmic male sterility. One mutation in particular, referred to as T
cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A
DNA sequence, designated TURF-13 (Levings, 1990), was identified that
correlates with T cytoplasm. It is proposed that it would be possible, through
the introduction of TURF-13 via transformation, to separate male sterility
from disease sensitivity. As it is necessary to be able to restore male fertility
for breeding purposes and for grain production, it is proposed that genes
encoding restoration of male fertility also may be introduced.

h. Improved nutritional content

Genes may be introduced into plants to improve the nutrient quality or
content of a particular crop. Introduction of genes that alter the nutrient
composition of a crop may greatly enhance the feed or food value. For 
example, the protein of many grains is suboptimal for feed and food purposes 
especially when fed to pigs, poultry, and humans. The protein is deficient in
several amino acids that are essential in the diet of these species, requiring
the addition of supplements to the grain. Limiting essential amino acids may 
include lysine, methionine, tryptophan, threonine, valine, arginine, and 
histidine. Some amino acids become limiting only after corn is supplemented 
with other inputs for feed formulations. The levels of these essential amino
acids in seeds and grain may be elevated by mechanisms which include, but
are not limited to, the introduction of genes to increase the biosynthesis of 
the amino acids, increase the storage of the amino acids in proteins, or 
increase transport of the amino acids to the seeds or grain. 
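
As a rough illustration of how the limiting amino acids named above might be tallied for a candidate protein, the following minimal sketch computes the fraction of residues contributed by each. The function name, the one-letter mapping table, and the example peptide are hypothetical and are not taken from the present disclosure.

```python
from collections import Counter

# One-letter codes for the limiting essential amino acids listed above.
LIMITING = {
    "K": "lysine", "M": "methionine", "W": "tryptophan", "T": "threonine",
    "V": "valine", "R": "arginine", "H": "histidine",
}


def limiting_fractions(protein: str) -> dict[str, float]:
    """Return the fraction of residues contributed by each limiting amino acid."""
    seq = protein.upper()
    if not seq:
        raise ValueError("empty protein sequence")
    counts = Counter(seq)
    return {name: counts[code] / len(seq) for code, name in LIMITING.items()}


# Hypothetical 20-residue peptide, used for illustration only.
EXAMPLE_PEPTIDE = "MKTLLVAGQQPLKRHWTVAA"
profile = limiting_fractions(EXAMPLE_PEPTIDE)
```

A profile of this kind could be compared between a native storage protein and an introduced protein when judging whether the introduced protein improves the balance of amino acids.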

The protein composition of a crop may be altered to improve the
balance of amino acids in a variety of ways including elevating expression of
native proteins, decreasing expression of those with poor composition,
changing the composition of native proteins, or introducing genes encoding 
entirely new proteins possessing superior composition. 

The introduction of genes that alter the oil content of a crop plant may 
also be of value. Increases in oil content may result in increases in
metabolizable energy content and density of seeds for use in feed and food.
The introduced genes may encode enzymes that remove or reduce rate
limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes
may include, but are not limited to, those that encode acetyl-CoA
carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well
known fatty acid biosynthetic activities. Other possibilities are genes that
encode proteins that do not possess enzymatic activity, such as acyl-carrier
proteins. Genes may be introduced that alter the balance of fatty acids
present in the oil, providing a more healthful or nutritive feedstuff. The
introduced DNA also may encode sequences that block expression of
enzymes involved in fatty acid biosynthesis, altering the proportions of fatty
acids present in crops.

Genes may be introduced that enhance the nutritive value of the 
starch component of crops, for example by increasing, or in some cases
decreasing, the degree of branching, resulting in improved utilization of the
starch in livestock by delaying its metabolism. Additionally, other major 
constituents of a crop may be altered, including genes that affect a variety of 
other nutritive, processing, or other quality aspects. For example, 
pigmentation may be increased or decreased. 

Feed or food crops may also possess insufficient quantities of
vitamins, requiring supplementation to provide adequate nutritive value.
Introduction of genes that enhance vitamin biosynthesis may be envisioned
including, for example, vitamins A (e.g., rice with vitamin A, or golden rice),
E, B12, choline, and the like. Mineral content may also be sub-optimal. Thus
genes that affect the accumulation or availability of compounds containing
phosphorus, sulfur, calcium, manganese, zinc, and iron among others would
be valuable. 

Numerous other examples of improvements of crops may be effected 
using the artificial chromosomes, with appropriate heterologous genes
contained therein, in accordance with the methods and compositions
provided herein. The improvements may not necessarily involve grain, but
may, for example, improve the value of a crop for silage. Introduction of
DNA to accomplish this might include sequences that alter lignin production
such as those that result in the "brown midrib" phenotype associated with
superior feed value for cattle.

In addition to direct improvements in feed or food value, genes also 
may be introduced which improve the processing of crops and improve the 
value of the products resulting from the processing. One use of crops is via
wetmilling. Thus, genes that increase the efficiency and reduce the cost of
such processing, for example, by decreasing steeping time, may also find use.
Improving the value of wetmilling products may include altering the quantity
or quality of starch, oil, corn gluten meal, or the components of gluten feed.
Elevation of starch may be achieved through the identification and
elimination of rate limiting steps in starch biosynthesis or by decreasing
levels of the other components of crops resulting in proportional increases in
starch. 

Oil is another product of wetmilling, the value of which may be
improved by introduction and expression of genes. Oil properties may be
altered to improve its performance in the production and use of cooking oil,
shortenings, lubricants or other oil-derived products or to improve its
health attributes when used in food-related applications. Fatty acids also
may be synthesized which upon extraction can serve as starting materials for
chemical syntheses. The changes in oil properties may be achieved by
altering the type, level, or lipid arrangement of the fatty acids present in the
oil. This in turn may be accomplished by the addition of genes that encode
enzymes that catalyze the synthesis of new fatty acids and the lipids
possessing them or by increasing levels of native fatty acids while possibly
reducing levels of precursors. Alternatively, DNA sequences may be
introduced which slow or block steps in fatty acid biosynthesis resulting in
the increase in precursor fatty acid intermediates. Genes that might be
added include desaturases, epoxidases, hydratases, dehydratases and other
enzymes that catalyze reactions involving fatty acid intermediates.
Representative examples of catalytic steps that might be blocked include the
desaturations from stearic to oleic acid and oleic to linolenic acid resulting in
the respective accumulations of stearic and oleic acids. Another example is
the blockage of elongation steps resulting in the accumulation of C8 to C12
saturated fatty acids. 

i. Production of chemicals or biologicals 

Transgenic plants can be used as protein production systems to
generate recombinant products ranging from industrial enzymes, viral
antigens, vaccines, antibodies, human blood proteins, cytokines, growth
factors, enkephalins and serum albumin to other proteins of clinical relevance
and pharmaceuticals. For example, enzymes including α-amylase, glucanase,
phytase and xylanase can be produced in plants (see, Goddijn and Pen (1995)
Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology 10:292-296;
Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-1919; and, e.g.,
Herbers and Sonnewald (1996) in Transgenic Plants: A Production System for
Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley & Sons,
West Sussex, England).

Examples of medically relevant proteins that may be produced in
plants include surface antigens of viral pathogens, such as hepatitis B virus
and transmissible gastroenteritis virus spike protein, for use in vaccines. The
proteins thus produced may be isolated and administered through standard
vaccine introduction methods or through the consumption of the edible
transgenic plant as food which can be taken orally (see, e.g., U.S. Patent No.
6,136,320 and Mason et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:11745-
11749). HIV, rhinovirus, malarial and rabies virus antigens are additional
examples of antigens that may be expressed in plants as candidate vaccines (see,
e.g., Porta et al. (1994) Virol. 202:949-955; Turpen et al. (1995)
Bio/Technology 13:53-57; and McGarvey et al. (1995) Bio/Technology
13:1484-1487). Antibodies may also be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science 268:716-
719).

Examples of human biopharmaceuticals that may be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Transgenic plants producing these compounds are made possible by
the introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds may be produced by the plant, extracted upon
harvest and/or processing, and used for any presently recognized useful
purpose such as pharmaceuticals, fragrances, and industrial enzymes to
name a few. Alternatively, plants produced in accordance with the methods
and compositions provided herein may be made to metabolize certain
compounds, such as hazardous wastes, thereby allowing bioremediation of
these compounds.

j. Non-protein-expressing sequences

Nucleic acids may be introduced into plants that are designed to
down-regulate or suppress a plant-encoded gene. A number of different means
to achieve down-regulation have been demonstrated in the art, including
antisense RNA, ribozymes and co-suppression. The use of antisense RNA to
suppress plant genes is described, for example, in U.S. Patent Nos.
4,801,540, 5,107,065 and 5,453,566. In such methods, an "antisense"
gene is constructed that encodes an RNA that is complementary to the
mRNA of a resident plant gene, such that expression of the antisense gene
inhibits the translation of the mRNA of the resident plant gene. Thus, the
activity of the resident gene is down-regulated.

An additional method of down-regulating gene activities involves
ribozymes, or catalytic hammerhead or hairpin RNA structures. The use of
ribozymes is described, for example, in U.S. Patent Nos. 4,987,071,
5,037,746, 5,116,742 and 5,354,855. These methods rely on the
expression of small catalytic "hammerhead" RNA molecules that are capable
of binding to and cleaving specific RNA sequences. Ribozymes designed to
specifically recognize a resident plant mRNA can be used to cleave the
mRNA and prevent its proper expression.

Down-regulation of gene activities essentially equivalent to that obtained
with ribozymes and antisense RNA can be achieved by adding additional
copies of the gene to be regulated. The process is referred to as
co-suppression and is described in, for example, U.S. Patent Nos. 5,034,323,
5,283,184 and 5,231,020.



WO 02/096923 



PCT/US02/17451 



-155- 

Numerous plant genes may be targeted for down regulation. For 
example, a gene may be down-regulated that encodes an enzyme that 
catalyzes a reaction in a plant. Reduction of the enzyme activity may reduce 
or eliminate products of the reaction which include any enzymatically 
synthesized compound in the plant such as fatty acids, amino acids,
carbohydrates, nucleic acids and the like. Alternatively, the protein may be a
storage protein, such as zein, or a structural protein, the decreased
expression of which may lead to changes in seed amino acid composition or
plant morphological changes, respectively. The possibilities cited above are
provided only by way of example and do not represent the full range of
applications. 

(1.) Antisense RNA

Genes may be constructed, which when transcribed, produce 
antisense RNA that is complementary to all or part(s) of a targeted
messenger RNA(s). The antisense RNA reduces production of the
polypeptide product of the messenger RNA. The polypeptide product may be
any protein encoded by the plant genome. The aforementioned genes will be
referred to as antisense genes. An antisense gene may thus be introduced
into a plant by transformation methods to produce a transgenic plant with
reduced expression of a selected protein of interest. For example, the
protein may be an enzyme that catalyzes a reaction in the plant. Reduction
of the enzyme activity may reduce or eliminate products of the reaction
which include any enzymatically synthesized compound in the plant such as
fatty acids, amino acids, carbohydrates, nucleic acids and the like.
Alternatively, the protein may be a storage protein, such as a zein, or a
structural protein, the decreased expression of which may lead to changes in
seed amino acid composition or plant morphological changes, respectively.
The possibilities cited above are provided only by way of example and do not 
represent the full range of applications. 
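
The complementarity between an antisense RNA and its target messenger RNA can be illustrated with a minimal sketch that derives the antisense sequence as the reverse complement of the target. The function name and the nine-nucleotide example sequence are hypothetical; the sketch shows base pairing only and is not a design procedure.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna: str) -> str:
    """Return the RNA complementary to `mrna`, written 5' to 3'."""
    seq = mrna.upper().replace("T", "U")  # accept a DNA-style spelling
    return seq.translate(RNA_COMPLEMENT)[::-1]


# Hypothetical fragment of a target messenger RNA, for illustration only.
TARGET_FRAGMENT = "AUGGCUUCA"
ANTISENSE_FRAGMENT = antisense_rna(TARGET_FRAGMENT)
```

Applying the function twice returns the original sequence, which is a convenient check that the two strands are mutually complementary.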

(2.) Ribozymes

Genes also may be constructed or isolated, which when transcribed,
produce RNA enzymes (ribozymes) which can act as endoribonucleases and
catalyze the cleavage of RNA molecules with selected sequences. The
cleavage of selected messenger RNAs can result in the reduced production of
their encoded polypeptide products. These genes may be used to prepare
transgenic plants which possess them. The transgenic plants may possess 
reduced levels of polypeptides including, but not limited to, the polypeptides 
cited above. 

Ribozymes are RNA-protein complexes that cleave nucleic acids in a
site-specific fashion. Ribozymes have specific catalytic domains that
possess endonuclease activity (Kim and Cech, 1987; Gerlach et al., 1987;
Forster and Symons, 1987). For example, a large number of ribozymes
accelerate phosphoester transfer reactions with a high degree of specificity,
often cleaving only one of several phosphoesters in an oligonucleotide
substrate (Cech et al., 1981; Michel and Westhof, 1990; Reinhold-Hurek
and Shub, 1992). This specificity has been attributed to the requirement 
that the substrate bind via specific base-pairing interactions to the internal 
guide sequence ("IGS") of the ribozyme prior to chemical reaction. 

Ribozyme catalysis has primarily been observed as part of sequence-
specific cleavage/ligation reactions involving nucleic acids (Joyce, 1989;
Cech et al., 1981). For example, U.S. Patent 5,354,855 reports that certain
ribozymes can act as endonucleases with a sequence specificity greater than 
that of known ribonucleases and approaching that of the DNA restriction 
enzymes. 

Several different ribozyme motifs have been described with RNA
cleavage activity (Symons, 1992). Examples include sequences from the
Group I self splicing introns including Tobacco Ringspot Virus (Prody et al.,
1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981)
and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences
from these and related viruses are referred to as hammerhead ribozymes
based on a predicted folded secondary structure. 

Other suitable ribozymes include sequences from RNase P with RNA 
cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U.S. Patents
5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et
al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes
(U.S. Patent 5,625,047). The general design and optimization of ribozyme
directed RNA cleavage activity has been discussed in detail (Haselhoff and
Gerlach, 1988; Symons, 1992; Chowrira et al., 1994; Thompson et al.,
1995).

Another variable in ribozyme design is the selection of a cleavage
site on a given target RNA. Ribozymes are targeted to a given sequence by
virtue of annealing to a site by complementary base pair interactions. Two
stretches of homology are required for this targeting. These stretches of
homologous sequences flank the catalytic ribozyme structure defined above.
Each stretch of homologous sequence can vary in length from 7 to 15
nucleotides. The only requirement for defining the homologous sequences is
that, on the target RNA, they are separated by a specific sequence which is
the cleavage site. For a hammerhead ribozyme, the cleavage site is a
dinucleotide sequence on the target RNA: a uracil (U) followed by either an
adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et
al., 1995). The frequency of this dinucleotide occurring in any given RNA is
statistically 3 out of 16. Therefore, for a given target messenger RNA of 
1,000 bases, 187 dinucleotide cleavage sites are statistically possible. 
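
The arithmetic behind this estimate can be sketched as follows. With four equally likely bases, 3 of the 16 possible dinucleotides (UA, UC and UU) match the cleavage site, so a 1,000-base RNA is expected to carry roughly 187 candidate sites. The minimal sketch below, whose function names and example sequence are hypothetical, locates candidate sites in a given sequence and computes the expected count under the assumption of equal base frequencies.

```python
import re


def hammerhead_sites(rna: str) -> list[int]:
    """Return 0-based positions of every U that is followed by A, C or U."""
    seq = rna.upper().replace("T", "U")  # accept a DNA-style spelling
    # The lookahead lets overlapping sites such as "UUU" be counted separately.
    return [m.start() for m in re.finditer(r"U(?=[ACU])", seq)]


def expected_sites(length: int) -> float:
    """Expected site count in a random RNA with equal base frequencies.

    A sequence of `length` bases contains `length - 1` dinucleotides, and
    3 of the 16 possible dinucleotides (UA, UC, UU) qualify.
    """
    return (length - 1) * 3 / 16


# Hypothetical eight-nucleotide fragment, for illustration only.
EXAMPLE_SITES = hammerhead_sites("GUCAUUGG")
```

The actual number of sites in a real messenger RNA depends on its base composition, so the expected count is only a guide to how many candidate sites are available for testing.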

Designing and testing ribozymes for efficient cleavage of a target RNA
is a process well known to those skilled in the art. Examples of scientific
methods for designing and testing ribozymes are described by Chowrira et al.
(1994) and Lieber and Strauss (1995), each incorporated by reference. The
identification of operative and preferred sequences for use in down-regulating
a given gene is simply a matter of preparing and testing a given sequence,
and is a routinely practiced "screening" method known to those of skill in the
art. 

(3.) Induction of gene silencing

It also is possible that genes may be introduced to produce transgenic
plants which have reduced expression of a native gene product by the
mechanism of co-suppression. It has been demonstrated in tobacco, tomato,
and petunia (Goring et al., 1991; Smith et al., 1990; Napoli et al., 1990; van
der Krol et al., 1990) that expression of the sense transcript of a native gene
will reduce or eliminate expression of the native gene in a manner similar to
that observed for antisense genes. The introduced gene may encode all or
part of the targeted native protein but its translation may not be required for
reduction of levels of that native protein. 

(4.) Non-RNA-expressing sequences

DNA elements including those of transposable elements such as Ds,
Ac, or Mu, may be inserted into a gene to cause mutations. These DNA
elements may be inserted in order to inactivate (or activate) a gene and
thereby "tag" a particular trait. In this instance the transposable element
does not cause instability of the tagged mutation, because the utility of the
element does not depend on its ability to move in the genome. Once a
desired trait is tagged, the introduced DNA sequence may be used to clone
the corresponding gene, e.g., using the introduced DNA sequence as a PCR
primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta
et al., 1988). Once identified, the entire gene(s) for the particular trait,
including control or regulatory regions where desired, may be isolated, cloned
and manipulated as desired. The utility of DNA elements introduced into an
organism for purposes of gene tagging is independent of the DNA sequence 
and does not depend on any biological activity of the DNA sequence, i.e., 
transcription into RNA or translation into protein. The sole function of the 
DNA element is to disrupt the DNA sequence of a gene.

It is contemplated that unexpressed DNA sequences, including
synthetic sequences, could be introduced into cells as proprietary "labels" of
those cells and plants and seeds thereof. It would not be necessary for a
label DNA element to disrupt the function of a gene endogenous to the host
organism, as the sole function of this DNA would be to identify the origin of
the organism. For example, one could introduce a unique DNA sequence into
a plant and this DNA element would identify all cells, plants, and progeny of
these cells as having arisen from that labeled source. It is proposed that
inclusion of label DNAs would enable one to distinguish proprietary
germplasm or germplasm derived from such, from unlabelled germplasm.
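
One way such a label might be checked in sequence data is sketched below, assuming the label is a known unique DNA string and that the sample sequence may have been read from either strand. The function names and the ten-base label are hypothetical and serve only to illustrate the idea.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def carries_label(sample: str, label: str) -> bool:
    """Report whether `label` occurs in `sample` on either strand."""
    s = sample.upper()
    return label.upper() in s or reverse_complement(label) in s


# Hypothetical label sequence, for illustration only.
LABEL = "GATTACAGGC"
labelled_sample = "AAAA" + reverse_complement(LABEL) + "CCCC"
unlabelled_sample = "ACGTACGTACGTACGT"
```

In practice a label would need to be long enough that it is not expected to occur by chance in the host genome; the length used here is chosen only to keep the example short.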

Another possible element which may be introduced is a matrix 
attachment region element (MAR), such as the chicken lysozyme A element 
(Stief, 1989), which can be positioned around an expressible gene of interest 
to effect an increase in overall expression of the gene and diminish position
dependent effects upon incorporation into the plant genome (Stief et al.,
1989; Phi-Van et al., 1990). Sequences such as MARs can be included on
the artificial chromosome to enhance gene expression.

3. Transgenic models for evaluation of genes and discovery of 
new traits 

Of significant interest is the use of plants and plant cells containing
artificial chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Artificial chromosomes, because
they can contain significant amounts of DNA, can encode
numerous genes and accordingly a multiplicity of traits. It is contemplated
here that artificial chromosomes, when formed from one plant species, can
be evaluated in a second plant species. The resultant phenotypic changes
observed, for example, can indicate the nature of the genes contained within
the DNA contained in the artificial chromosome, and hence permit the
identification of new genetic activities. Artificial chromosomes containing
euchromatic DNA or partially containing euchromatic DNA can serve as a
valuable source of new traits when transferred to an alien plant cell
environment. For example, it is contemplated that artificial chromosomes
derived from dicot plant species can be introduced into monocot plant
species by transferring a dicot artificial chromosome that contains
a region of euchromatic DNA containing expressed genes.

The artificial chromosomes can be generated or manipulated in such a 
fashion that a large region of naturally occurring plant DNA becomes 
incorporated into the artificial chromosome. This allows the artificial
chromosome to contain new genetic activities and hence carry new traits.
For example, an artificial chromosome can be introduced into a wild relative
of a crop plant under conditions whereby a portion of the DNA present in the
chromosomes of the wild relative is transferred to the artificial chromosome.
After isolation of the artificial chromosome, this naturally occurring region of
DNA from the wild relative, now located on the artificial chromosome, can be
introduced into the domesticated crop species and the genes encoded within 
the transferred DNA expressed and evaluated for utility. New traits and gene 
systems can be discovered in this fashion. 

Artificial chromosomes modified to recombine with plant DNA offer
many advantages for the discovery and evaluation of traits in different plant
species. When the artificial chromosome containing DNA from one plant
species is introduced into a new plant species, new traits and genes can be
introduced. This use of an artificial chromosome makes it possible to
overcome the sexual barrier that prevents transfer of genes from one plant
species to another species. Using artificial chromosomes in this fashion
allows for many potentially valuable traits to be identified including traits that
are typically found in wild species. Other valuable applications for artificial
chromosomes include the ability to transfer large regions of DNA from one
plant species to another, for example DNA encoding potentially valuable traits
such as altered oil, carbohydrate or protein composition, multiple genes encoding
enzymes capable of producing valuable plant secondary metabolites, genetic
systems encoding valuable agronomic traits such as disease and insect
resistance, genes encoding functions that allow association with soil
bacteria such as growth promoting bacteria or nitrogen fixing bacteria, or
genes encoding traits that confer freezing, drought or other stress tolerances.
In this fashion, artificial chromosomes can be used to discover regions of 
plant DNA that encode valuable traits. 

The artificial chromosome can also be designed to allow the transfer 
and subsequent incorporation of these valuable traits now located on the
artificial chromosome into the natural chromosomes of a plant species. In
this fashion the artificial chromosomes can be used to transfer large regions
of DNA encoding traits normally found in one plant species into another plant
species. In this fashion, it is possible to derive a plant cell that no longer
needs to carry an artificial chromosome to possess the new trait. Thus the
artificial chromosome would serve as the transfer mechanism to permit the
formation of plants with a greater degree of genetic diversity.

An artificial chromosome can be designed in a variety of ways to
accomplish the aforementioned purposes. An artificial chromosome can be
modified to contain sequences that promote homologous recombination
within plant cells, or be modified to contain a genetic system that functions
as a site-specific recombination system. For example, the DNA sequence of
Arabidopsis is now known. To construct an artificial chromosome capable of
recombining with a specific region of Arabidopsis DNA, a sequence of
Arabidopsis DNA normally located near a chromosomal location encoding
genes of potential interest can be introduced into an artificial chromosome by
methods provided herein. It may be desirable to include a second region of
DNA within the artificial chromosome that provides a second flanking
sequence to the region encoding genes of potential interest, to promote a
double recombination event which would ensure transfer of the entire
chromosomal region encoding genes of potential interest to the artificial
chromosome. The modified artificial chromosome, containing the DNA
sequences capable of homologous recombination, can then be
introduced into Arabidopsis cells and the homologous recombination event
selected.
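
The outcome of the double recombination event described above can be pictured with a minimal string-based sketch, in which two flanking sequences shared by a plant chromosome and an artificial chromosome define the region that is transferred. The function name, the flank sequences and the toy chromosomes are hypothetical, and the sketch models only the end result, not the recombination mechanism.

```python
def double_crossover(donor: str, acceptor: str, left: str, right: str) -> str:
    """Copy the donor region lying between two shared flanks into the acceptor.

    Both molecules must carry `left` followed by `right`. Whatever the
    acceptor holds between the flanks is replaced by the donor region.
    """
    d_left = donor.find(left)
    a_left = acceptor.find(left)
    if d_left == -1 or a_left == -1:
        raise ValueError("left flank missing from donor or acceptor")
    d_right = donor.find(right, d_left + len(left))
    a_right = acceptor.find(right, a_left + len(left))
    if d_right == -1 or a_right == -1:
        raise ValueError("right flank missing from donor or acceptor")
    region = donor[d_left + len(left):d_right]
    return acceptor[:a_left + len(left)] + region + acceptor[a_right:]


# Hypothetical flanks and toy sequences, for illustration only.
LEFT_FLANK = "CTAGGATC"
RIGHT_FLANK = "TGACCGTA"
plant_chromosome = "TTTT" + LEFT_FLANK + "ATGCATGCAT" + RIGHT_FLANK + "CCCC"
artificial_chromosome = "GGGG" + LEFT_FLANK + RIGHT_FLANK + "AAAA"
recombinant = double_crossover(
    plant_chromosome, artificial_chromosome, LEFT_FLANK, RIGHT_FLANK
)
```

With a single flank only one crossover is possible, which is why the second flanking sequence is described as ensuring transfer of the entire region of interest.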

It is convenient to include a marker gene to allow for the selection of a
homologous recombination event. The marker gene is preferably inactive
unless activated by an appropriate homologous recombination event. For
example, US 5,272,071 describes a method where an inactive plant gene is
activated by a recombination event such that desired homologous
recombination events can be easily scored. Similarly, US 5,501,967
describes a method for the selection of homologous recombination events by
activation of a silent selection gene first introduced into the plant DNA, the
gene being activated by an appropriate homologous recombination event.
Both of these methods can be applied to enable a selective process to be
included to select for recombination between an artificial chromosome and
a plant chromosome. Once the homologous recombination event is
detected and selected, the artificial chromosome is isolated and introduced
into a recipient cell, for example, tobacco, corn, wheat or rice, and the
expression of the newly introduced DNA sequences evaluated. Selection of
recombinant events can take place in cell culture, or following seed formation
and screening of seedling plants or seed itself. 

Phenotypic changes in the recipient plant cells containing the artificial
chromosome, or in regenerated plants containing the artificial chromosome,
allow for the evaluation of the nature of the traits encoded by the genes of
interest, for example, Arabidopsis DNA, under conditions naturally found in
plant cells, including the naturally occurring arrangement of DNA sequences 
responsible for the developmental control of the traits in the normal 
chromosomal environment. 

Traits such as durable fungal or bacterial disease resistance, new oil and
carbohydrate compositions, valuable secondary metabolites such as
phytosterols and flavonoids, efficient nitrogen fixation or mineral utilization,
and resistance to extremes of drought, heat or cold are all found within different
populations of plant species and are often governed by multiple genes. The use
of single gene transformation technologies does not permit the evaluation of the
multiplicity of genes controlling many valuable traits. Thus, incorporation of
these genes into artificial chromosomes allows the rapid evaluation of the utility 
of these genetic combinations in heterologous plant species. 

The large scale order and structure of the artificial chromosome provides 
a number of unique advantages in screening for new utilities or new phenotypes
within heterologous plant species. The size of new DNA that can be carried by
an artificial chromosome can be millions of base pairs of DNA, representing
potentially numerous genes that may have different or new utility in a
heterologous plant cell. Because the artificial chromosome is a "natural"
environment for gene expression, the problems of variable gene expression and
silencing seen for genes transferred by random insertion into a genome should not be
observed. Similarly, there is no need to engineer the genes for expression, and 
the genes inserted would not need to be recombinant genes. Thus, transferred 
genes are fully expected to be expressed in the typical temporal and spatial 
fashion as observed in the species from where the genes were initially isolated. 

A valuable feature for these utilities is the ability to isolate the artificial
chromosomes and to further isolate, manipulate and introduce into other cells 
artificial chromosomes carrying unique genetic compositions. 

Thus, artificial chromosomes and homologous recombination
in plant cells can be used to isolate and identify many valuable crop traits. In
addition to the use of artificial chromosomes for the isolation and testing of
large regions of naturally occurring DNA, methods for the use of artificial
chromosomes and cloned DNA are also contemplated. Similar to that described
above, artificial chromosomes can be used to carry large regions of cloned DNA,
including that derived from other plant species.

The ability to incorporate DNA elements into artificial chromosomes as
they are being formed allows for the development of artificial chromosomes 
specifically engineered as a platform for testing of new genetic combinations, 
or "genomic" discoveries for model species such as Arabidopsis. Specific 
5 "recombinase" systems can be used in plant cells to excise or re-arrange genes; 
these same systems can be used to derive new gene combinations contained 
on an artificial chromosome. In this regard, it is contemplated that the use of 
site specific recombination sequences can have considerable utility in 
developing artificial chromosomes containing DNA sequences recognized by 

10 recombinase enzymes and capable of accepting DNA sequences containing 
same. The use of site-specific recombination as a means to target an 
introduced DNA to a specific locus has been demonstrated in the art and such 
methods can be employed. The recombinase systems can also be used to 
transfer the cloned DNA regions contained within the artificial chromosome to 

15 the naturally occurring plant chromosomes. 

Many site-specific recombinases have been described in the literature
(Kilby et al., Trends in Genetics, 9(12): 413-418, 1993). Among these are:
an activity identified as R encoded by the pSR1 plasmid of Zygosaccharomyces
rouxii, FLP encoded by the 2 µm circular plasmid from Saccharomyces
cerevisiae and Cre-lox from the phage P1.

The integration function of site-specific recombinases is contemplated as
a means to assist in the derivation of genetic combinations on artificial
chromosomes. In order to accomplish this, it is contemplated that a first step
of introducing site-specific recombinase sites into the genome of a plant cell in
an essentially random manner is conducted, such that the plant cell has one or
more site-specific recombinase recognition sequences on one or more of the
plant chromosomes. An artificial chromosome is then introduced into the plant
cell, the artificial chromosome engineered to contain a recombinase recognition
site capable of being recognized by a site-specific recombinase. Optionally a
gene encoding a recombinase enzyme is also included, preferably under the
control of an inducible promoter. Expression of the site-specific recombinase
enzyme in the plant cell, either by induction of an inducible recombinase gene
or by transient expression of a recombinase sequence, causes a site-specific
recombination event to take place, leading to the insertion of a region of the
plant chromosomal DNA containing the recombinase recognition site into the
recombinase recognition site of the artificial chromosome, forming an artificial
chromosome containing plant chromosomal DNA. The artificial chromosome
can be isolated and introduced into a heterologous host, preferably a plant host,
and expression of the newly introduced plant chromosomal DNA can be
monitored and evaluated for desirable phenotypic changes. Accordingly,
carrying out this recombination with a population of plant cells wherein the
chromosomally located recombinase recognition site is randomly scattered
throughout the chromosomes of the plant can lead to the formation of a
population of artificial chromosomes, each with a different region of plant
chromosomal DNA, each representing a new genetic combination.

This particular method involves the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has been
demonstrated in the art. For example, Fukushige and Sauer (Proc. Natl. Acad.
Sci. USA, 89:7905-7909, 1992) demonstrated that the Cre-lox homologous
recombination system could be successfully employed to introduce DNA into a
predefined locus in a chromosome of mammalian cells. In this demonstration
a promoter-less antibiotic resistance gene modified to include a lox sequence at
the 5' end of the coding region was introduced into CHO cells. Cells were
re-transformed by electroporation with a plasmid that contained a promoter with
a lox sequence and a transiently expressed Cre recombinase gene. Under the
conditions employed, the expression of the Cre enzyme catalyzed the
homologous recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene and the lox site in the introduced
promoter sequence, leading to the formation of a functional antibiotic resistance
gene. The authors demonstrated efficient and correct targeting of the
introduced sequence: 54 of 56 lines analyzed corresponded to the predicted
single copy insertion of the DNA due to Cre catalyzed site-specific homologous
recombination between the lox sequences.

The use of the same Cre-lox system has been demonstrated in plants
(Dale and Ow, Gene 91:79-85, 1995) to specifically excise, delete or insert
DNA. The precise event is controlled by the orientation of lox DNA sequences:
in cis the lox sequences direct the Cre recombinase to either delete (lox
sequences in direct orientation) or invert (lox sequences in inverted orientation)
DNA flanked by the sequences, while in trans the lox sequences can direct a
homologous recombination event resulting in the insertion of a recombinant
DNA. Accordingly a lox sequence may be first added to a genome of a plant
species capable of being transformed and regenerated to a whole plant to serve
as a recombinase target DNA sequence for recombination with an artificial
chromosome. The lox sequence may be optimally modified to further contain
a selectable marker which is inactive but can be activated by insertion of the lox
recombinase recognition sequence into the artificial chromosome.
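
The orientation rule described above (direct lox repeats lead to deletion, inverted lox repeats lead to inversion) can be illustrated with a minimal Python sketch. This is a toy model offered only as an aid to the reader: the representation of a construct as a list of (name, orientation) pairs and the function name cre_recombine_cis are assumptions made for illustration and are not part of the methods described herein.

```python
from typing import List, Tuple

# Hypothetical toy model: a construct is an ordered list of (name, orientation)
# elements; lox sites are named "lox" and orientation is +1 or -1.
Element = Tuple[str, int]


def cre_recombine_cis(construct: List[Element]) -> List[Element]:
    """Apply one recombination event between the first two lox sites in cis."""
    sites = [i for i, (name, _) in enumerate(construct) if name == "lox"]
    if len(sites) < 2:
        return list(construct)  # fewer than two sites: nothing to recombine
    a, b = sites[0], sites[1]
    if construct[a][1] == construct[b][1]:
        # Direct orientation: the flanked DNA and one lox site are excised.
        return construct[:a + 1] + construct[b + 1:]
    # Inverted orientation: the flanked DNA is reversed and each element flipped.
    flipped = [(name, -orient) for name, orient in reversed(construct[a + 1:b])]
    return construct[:a + 1] + flipped + construct[b:]


direct = [("promoter", 1), ("lox", 1), ("stuffer", 1), ("lox", 1), ("marker", 1)]
inverted = [("lox", 1), ("geneA", 1), ("geneB", 1), ("lox", -1)]
print(cre_recombine_cis(direct))
print(cre_recombine_cis(inverted))
```

In the first construct the "stuffer" element is removed and a single lox site remains; in the second the two genes are reversed in order and orientation while both lox sites are retained.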

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the chromosomes
of a plant cell, can be used to engineer a platform chromosome. A promoter is
linked to a recombinase recognition site, in an orientation that allows the
promoter to control the expression of the marker or selectable marker gene
upon recombination within the artificial chromosome. Upon a site-specific
recombination event between a recombinase recognition site in a plant
chromosome and the recombinase recognition site within the introduced
artificial chromosome, a cell is derived with a recombined artificial chromosome,
the artificial chromosome containing an active marker or selectable marker
activity that permits the identification and/or selection of the cell.
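
The marker-activation logic described above can be sketched in Python as follows. This is a deliberately simplified model, assuming that recombination in trans behaves as a reciprocal exchange of the DNA lying downstream of the two recognition sites; the element names and the functions exchange_at_lox and marker_is_active are hypothetical and serve only to show why the marker becomes selectable after the event.

```python
def marker_is_active(chromosome):
    """Return True if a promoter lies immediately upstream of lox-marker."""
    for i in range(len(chromosome) - 2):
        if chromosome[i:i + 3] == ["promoter", "lox", "marker"]:
            return True
    return False


def exchange_at_lox(artificial, plant):
    """Model a reciprocal exchange at the single lox site on each chromosome."""
    i = artificial.index("lox")
    j = plant.index("lox")
    # Each product keeps its own upstream arm and gains the other's downstream arm.
    new_artificial = artificial[:i + 1] + plant[j + 1:]
    new_plant = plant[:j + 1] + artificial[i + 1:]
    return new_artificial, new_plant


# Hypothetical starting chromosomes: the plant carries a promoterless marker
# next to its lox site; the artificial chromosome carries a promoter next to lox.
artificial = ["centromere", "promoter", "lox", "telomere"]
plant = ["geneX", "lox", "marker", "geneY"]

recombined_ac, recombined_plant = exchange_at_lox(artificial, plant)
print(marker_is_active(plant), marker_is_active(recombined_ac))
```

Before the event neither chromosome carries an expressed marker; afterwards only the recombined artificial chromosome does, which is the basis for identifying or selecting the cell.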

The artificial chromosomes can be transferred to other plant species and
the functionality of the new combinations tested. The ability to conduct such
an inter-chromosomal transfer of sequences has been demonstrated in the art.
For example, the use of the Cre-lox recombinase system to cause a
chromosome recombination event between two chromatids of different
chromosomes has been shown.

Any number of recombination systems may be employed (see, U.S.
provisional application Serial No. filed the same day herewith under attorney
docket no. 24601-P420). Such systems include, but are not limited to,
bacterially derived systems such as the Int/att system of phage lambda and the
Gin/gix system.

More than one recombination system may be employed, including, for
example, one recombinase system for the introduction of DNA into an artificial
chromosome, and a second recombinase system for the subsequent transfer of
the newly introduced DNA contained within an artificial chromosome into the
naturally occurring chromosome of a second plant species. The choice of the
specific recombination system used will be dependent on the nature of the
modification contemplated.

By having the ability to isolate an artificial chromosome, and in particular
artificial chromosomes containing plant chromosomal DNA introduced via
site-specific recombination, and to re-introduce the chromosome into other cells,
particularly plant cells, these new combinations can be evaluated in different
crop species without the need to first isolate and modify the genes, or carry out
multiple transformations or gene transfers to achieve the same combination
for isolation and testing of combinations of the genes in plants. The use of a
site-specific recombinase and artificial chromosomes also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA vectors
and systems for manipulation and study.

The artificial chromosomes can be engineered as platforms to accept
large regions of cloned DNA, such as that contained in Bacterial Artificial
Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs). It is further
contemplated that, as a result of the typical structure of amplification-based
artificial chromosomes, such as, for example, SATACs (or ACes), containing
tandemly repeated DNA blocks, more than one cloned DNA sequence can be
introduced by recombination processes. In particular, recombination within a
predefined region of the tandemly repeated DNA within the artificial
chromosome provides a mechanism to "stack" numerous regions of cloned
DNA, including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In particular, it
is contemplated that multiple YACs or BACs can be stacked onto an artificial
chromosome, the BACs or YACs containing multiple genes of complex
pathways or multiple genetic pathways. The BACs or YACs are typically
selected based on genetic information available within the public domain, for
example from the Arabidopsis Information Management System
(http://aims.cps.msu.edu/aims/index.html) or the information related to the plant
DNA sequences available from the Institute for Genomic Research
(http://www.tigr.org) and other sites known to those skilled in the art.
Alternatively, clones can be chosen at random and evaluated for functionality.
It is contemplated that combinations providing a desired phenotype can be
identified by isolation of the artificial chromosome containing the combination
and analyzing the nature of the inserted cloned DNA.

In another embodiment of the methods provided herein for discovering
genes associated with plant traits, the artificial chromosome used to transfer
plant DNA to a host cell for evaluation therein will contain large regions of plant
DNA, in particular plant euchromatin, as a result of the process by which the
artificial chromosome is produced. In particular, the artificial chromosome may
be an amplification-based artificial chromosome, including, but not limited to:
(1) a minichromosome arising from breakage of a dicentric chromosome, (2) an
artificial chromosome containing one or more regions of repeating nucleic acid
units wherein the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid, (3) an artificial chromosome
containing one or more regions of repeating nucleic acid units wherein the
repeat region(s) is made up predominantly of euchromatic DNA or contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA, (4) an artificial chromosome containing one or more
regions of repeating nucleic acid units wherein the artificial chromosome is
made up of substantially equivalent amounts of heterochromatin and
euchromatin, (5) an artificial chromosome that contains one or more regions
of repeating nucleic acid units having common nucleic acid sequences that
represent euchromatic and heterochromatic nucleic acid and (6) a sausage-like
structure that contains a portion or all of a euchromatin-containing arm of a
plant chromosome.

In these methods for discovering genes associated with plant traits,
because the artificial chromosome used to transfer plant DNA to a host cell for
evaluation therein is generated to already contain large amounts of plant DNA,
in particular plant euchromatin, there is no need to introduce plant euchromatin
into the artificial chromosomes by homologous or site-specific recombination.

4. Use of artificial chromosomes for preparation and screening of 
libraries 

Since large fragments of DNA can be incorporated into artificial
chromosomes (ACs), they are well-suited for use as cloning vehicles that can
accommodate entire genomes in the preparation of genomic DNA libraries,
which then can be readily screened for functionality as described above or for
specific gene sequences for further modification and study. For example, it is
possible to use artificial chromosomes to prepare artificial chromosome libraries
containing plant genomic DNA that are useful in the identification and isolation
of functional DNA components such as genes, centromeric DNA and telomeric
DNA from a variety of different species of plants.

The following examples are included for illustrative purposes only and are 
not intended to limit the scope of the invention. 

Example 1 

Generation of Arabidopsis protoplasts

Plant protoplasts are typically generated from plant cells following
standard techniques (for example, Maheshwari et al., Crit. Rev. Plant Sci.
14:149-178, 1995; Ramulu et al., Methods in Molecular Biology 111:227-242,
1999). Typically plant protoplasts are prepared from fresh plant tissue, e.g.,
leaf, or can be prepared by converting cell suspension cultures to protoplasts
by removal of the cell walls enzymatically. For production of Arabidopsis
protoplasts, the methods of Karesh et al. (Plant Cell Reports 9:575-578, 1991)
and Mathur et al. (Plant Cell Reports 14:221-226, 1995) were used, with the
modifications described below, to generate Arabidopsis suspension cultures.
These cells were maintained in liquid culture and subcultured as required,
usually between 7 and 10 days in culture.

Establishment of suspension cultures 

Cell suspension cultures derived from root callus of Arabidopsis thaliana
cv. Columbia, RLD and Landsberg erecta were used. Calli were induced from
roots of 3 week-old seedlings on callus induction medium containing MS basic
media (Murashige and Skoog (1962) Physiol. Plant 15:473-497) with 3%
sucrose, 0.5 mg/l naphthalene acetic acid (NAA), 0.05 mg/l Kinetin (Sigma
Aldrich Canada). The cell suspension cultures were grown from the calli in
liquid callus induction medium at 22°C with shaking at 120 rpm. They were
subcultured every 7 days.

Generation of protoplasts 

One gram of 4-5 day-old suspension culture was suspended in 6 ml
enzyme solution containing 1% Cellulase 'Onozuka' R-10 and 0.25%
Macerozyme R-10 in 35 g/l CaCl2·2H2O (Hartmann et al. (1998) Plant Mol. Biol.
36:741-754) and incubated at 22°C in the dark with shaking at 70 rpm for 15
h. The protoplast mixture was poured through a 100 µm nylon mesh sieve and
centrifuged at 250xg for 5 min. The protoplasts were washed with 35 g/l
CaCl2·2H2O and resuspended in 10 ml floating medium containing B5 medium
(Gamborg et al. (1968) Exp. Cell Res. 50:151-158) with 144 g/l sucrose and 1
mg/l 2,4-dichlorophenoxyacetic acid (2,4-D). The protoplasts were centrifuged
at 80xg for 10 min, collected at the interface and used immediately for
transfection.
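
The solutions above are specified in g/l. As an illustrative aid only, the following Python sketch converts such mass concentrations into approximate molar concentrations. The molar masses are standard reference values supplied here as assumptions, and the names MOLAR_MASS_G_PER_MOL and molarity are hypothetical.

```python
# Standard molar masses (g/mol); supplied as assumptions for this illustration.
MOLAR_MASS_G_PER_MOL = {
    "CaCl2.2H2O": 147.01,  # calcium chloride dihydrate
    "sucrose": 342.30,
}


def molarity(grams_per_litre: float, compound: str) -> float:
    """Convert a mass concentration in g/l to mol/l for a known compound."""
    if grams_per_litre < 0:
        raise ValueError("concentration must be non-negative")
    return grams_per_litre / MOLAR_MASS_G_PER_MOL[compound]


# Wash solution and floating medium concentrations taken from the text above.
print(round(molarity(35, "CaCl2.2H2O"), 3), "mol/l CaCl2.2H2O")
print(round(molarity(144, "sucrose"), 3), "mol/l sucrose")
```

Under these assumptions the 35 g/l calcium chloride dihydrate wash is roughly 0.24 M and the 144 g/l sucrose floating medium is roughly 0.42 M.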

Example 2 

Generation of Tobacco Mesophyll Protoplasts 

Mesophyll protoplasts were generated from leaves of sterile plantlets of N.
tabacum cv. Xanthi. The plantlets were grown aseptically on MSO medium (MS
basal media, 3% sucrose, 0.05% morpholinoethanesulfonic acid (MES), 1.0
mg/l benzyl adenine (BA), 0.1 mg/l NAA and 0.8% agar, pH 5.8) at 22°C under
a 16/8 h photoperiod (see also Bilang et al. (1994) Plant Molecular Biology
Manual A1:1-16). Fully expanded leaves (2x4 cm) were cut in half, the main
vein removed and the upper epidermis scored with parallel cuts. Leaf pieces
were immersed in 6 ml enzyme solution containing 1.2% Cellulase 'Onozuka'
R-10 and 0.4% Macerozyme R-10 in K4 medium (Nagy and Maliga (1976) Z.
Pflanzenphysiol. 75:453-455) and incubated at 22°C for 15 h without shaking.
The protoplasts were purified by pouring through a 100 µm nylon mesh sieve.
The protoplast suspension was carefully overlaid with 1 ml W5 solution (Bilang
et al. (1994) Plant Molecular Biology Manual A1:1-16) and centrifuged at 80xg
for 10 min. Protoplasts were then resuspended in W5 solution at a density of
1 x 10^6 protoplasts/ml and stored at 4°C for 1 to 2 hours prior to treatment, for
example, DNA uptake or chromosome transfer.
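
As a worked illustration of the final resuspension step, the short Python sketch below computes the volume of W5 solution needed to reach a target protoplast density. The function name resuspension_volume_ml and the example yield of 3.5 million protoplasts are hypothetical; only the target density of 1 x 10^6 protoplasts/ml comes from the text.

```python
def resuspension_volume_ml(total_protoplasts: float,
                           target_density_per_ml: float = 1e6) -> float:
    """Volume (ml) in which to resuspend a pellet to reach the target density."""
    if total_protoplasts < 0 or target_density_per_ml <= 0:
        raise ValueError("counts must be non-negative and density positive")
    return total_protoplasts / target_density_per_ml


# Hypothetical yield of 3.5 million protoplasts from one preparation.
print(resuspension_volume_ml(3.5e6), "ml of W5 solution")
```

The same calculation applies to other target densities by passing a different second argument.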

Example 3 

Production of Tobacco Protoplasts from Suspension Cultures 

Tobacco BY-2 protoplasts are prepared from suspension cultures according
to the method of Nagata et al. [(1981) Molecular and General Genetics,
184:161-165].

Example 4 

Generation of Brassica Hypocotyl Protoplasts 

Genotypes of Brassica napus, B. oleracea, B. juncea and B. carinata may
be used to generate protoplasts. Seeds of Brassica napus were
surface-sterilized (for 2 min with 70% ethanol, then for 20 min with 2.4%
sodium hypochlorite containing one drop of Tween 20 per 100 ml). Seeds were
rinsed thoroughly with sterile distilled water and grown aseptically on
autoclaved germination medium (half-strength basal Murashige and Skoog's
medium (MS), 1% sucrose, 0.8% agar, pH 5.8). Unless otherwise indicated,
the protoplast generation procedures were performed aseptically and solutions
and media were filter-sterilized. Alternatively, protoplasts can be generated and
cultured successfully from different explants using various protocol
modifications (for example, Kao et al. (1991) Plant Science 75:63-72; Kao et
al. (1990) Plant Cell Rep. 9:311-315; Kao and Seguin-Swartz (1987) Plant Cell
Tiss. Org. Cult. 10:79-90; Kao (1977) Mol. Gen. Genet. 150:225-230).

Generation of Hypocotyl Protoplasts

Hypocotyls were excised from 4- or 5-day-old seedlings grown aseptically
in the dark, with or without light exposure for a few hours prior to use. The
explants were cut transversely into 2-5 mm pieces and incubated in enzyme
solution (salts, vitamins and organic acids of Kao's medium (Kao (1977) Mol.
Gen. Genet. 150:225-230), 0.4 g/l CaCl2·2H2O, 13% sucrose, 1%
Cellulase 'Onozuka' R-10, 0.1% Pectolyase Y23, pH 5.6) in petri dishes, in
darkness, without agitation for 14-18 hours, then with agitation on a rotary
shaker (ca. 50 rpm) for 15-30 min.

The mixture was filtered through a 63 µm nylon screen into centrifuge
tubes, and an equal volume of 17.5% sucrose was added to each tube.
Following centrifugation (ca. 100xg, 8 min), the protoplast band that formed at
the top of each tube was collected. Protoplasts were washed 3 times by
resuspension in wash solution [solution W5 of Menczel and Wolfe (1984, Plant
Cell Rep 5:196-198) at a reduced strength (0.8X)] followed by centrifugation
at 100xg for 3-5 min and discarding the supernatant.

Protoplasts were cultured in Kao's medium containing the salts, vitamins
and organic acids with 30 g/l sucrose, 68.4 g/l glucose, 0.5 mg/l NAA, 0.5 mg/l
BA, 0.5 mg/l 2,4-D, pH 5.7, at a density of 1 x 10^5 per ml and incubated at
25°C, 16 h photoperiod, in dim fluorescent light (25 µE m^-2 s^-1).
After 5-8 days in culture, 1-1.5 ml of feeder medium containing the above
medium except with 55.8 g/l glucose instead of 68.4 g/l, were added to each
dish, and the dishes were placed under brighter fluorescent light (50 µE m^-2 s^-1).
At about 14 days, 1-2 ml of medium were removed from each dish, and 2-3 ml
of feeder medium containing basal B5 medium (Gamborg et al. (1968) Exp. Cell
Res. 50:151-158), 3% sucrose, 3.8% glucose, 0.5 mg/l BA, 0.5 mg/l NAA, and
0.5 mg/l 2,4-D, pH 5.7, were added. At about 21 days, if microcolonies have
not yet formed, the cultures can be fed with the last feeder medium except with
2.2% glucose instead of 3.8%. Protoplast cultures can be washed when
necessary by adding new feeder medium, gently swirling petri dishes, allowing
cells to settle, removing most of the supernatant and adding fresh medium to
the dishes.

At 3-5 weeks, microcolonies were embedded with medium containing a 1:1
mixture of the last feeder medium and proliferation medium which contains the
components of the feeder medium with 0.9% glucose and 1.6% agarose to
make a concentration of 0.8% in the final mixture. Cultures were incubated as
described above in bright fluorescent light (80-100 µE m^-2 s^-1). After 10 days
to 2 weeks, green colonies were plated onto the regeneration medium.
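
The 0.8% final agarose concentration follows from simple volume-weighted mixing. A minimal Python sketch of that arithmetic is given below; the function name mixed_concentration is hypothetical, and the sketch assumes that volumes are additive and that the feeder medium itself contains no agarose.

```python
def mixed_concentration(parts):
    """Volume-weighted concentration of a mixture.

    parts is a list of (volume, concentration) pairs; any consistent
    volume unit and any consistent concentration unit (e.g. % w/v) work.
    """
    total_volume = sum(volume for volume, _ in parts)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(volume * conc for volume, conc in parts) / total_volume


# 1:1 mixture of feeder medium (assumed 0% agarose) and proliferation
# medium (1.6% agarose), as described in the text.
print(mixed_concentration([(1.0, 0.0), (1.0, 1.6)]), "% agarose")
```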

Example 5 

Preparation of a Transformation Vector Useful for the Induction of
Plant Artificial Chromosome Formation

Plant artificial chromosomes (PACs) can be generated by introducing
nucleic acid, such as DNA, which can include an amplification-inducing DNA
and/or a targeting DNA, for example rDNA or lambda DNA, into a plant cell,
allowing the cell to grow, and then identifying from among the resulting cells
those that include a chromosome with a structure that is distinct from that of
any chromosome that existed in the cell prior to introduction of the nucleic acid.
The structure of a PAC reflects amplification of chromosomal DNA, for example,
segmented, repeat region-containing and heterochromatic structures. It is also
possible to select cells that contain structures that are precursors to PACs, for
example, chromosomes containing more than one centromere and/or fragments
thereof, and culture and/or manipulate them to ultimately generate a PAC within
the cell.

In the method of generating PACs, the nucleic acid can be introduced
into a variety of plant cells. The nucleic acid can include targeting DNA and/or
a plant expressible DNA encoding one or multiple selectable markers (e.g., DNA
encoding bialaphos (bar) resistance) or scorable markers (e.g., DNA encoding
GFP). Examples of targeting DNA include, but are not limited to, N. tabacum
rDNA intergenic spacer sequence (IGS) and Arabidopsis rDNA such as the 18S,
5.8S, 26S rDNA and/or the intergenic spacer sequence. The DNA can be
introduced using a variety of methods, including, but not limited to,
Agrobacterium-mediated methods, PEG-mediated DNA uptake and
electroporation using, for example, standard procedures according to Hartmann
et al. [(1998) Plant Molecular Biology 36:741]. The cell into which such DNA
is introduced can be grown under selective conditions, or can initially be grown
under non-selective conditions and then transferred to selective media. The
cells or protoplasts can be placed on plates containing a selection agent to
grow, for example, individual calli. Resistant calli can be scored for scorable
marker expression. Metaphase spreads of resistant cultures can be prepared,
and the metaphase chromosomes examined by FISH analysis using specific
probes in order to detect amplification of regions of the chromosomes. Cells
that have artificial chromosomes with functioning centromeres or artificial
chromosomal intermediate structures, including, but not limited to, dicentric
chromosomes, formerly dicentric chromosomes, minichromosomes,
heterochromatin structures (e.g. sausage chromosomes), and stable
self-replicating artificial chromosomal intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating artificial
chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs can be
in any form, including in the form of a vector. An exemplary vector for use in
methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable marker,
a targeting sequence, and a scorable marker were constructed using procedures
well known in the art to combine the various fragments. The vectors can be
prepared using vector pAg1 as a base vector and inserting the following DNA
fragments into pAg1: DNA encoding β-glucuronidase under the control of the
nopaline synthase (NOS) promoter fragment and flanked at the 3' end by the
NOS terminator fragment, a fragment of mouse satellite DNA and an N.
tabacum rDNA intergenic spacer sequence (IGS). In constructing plant
transformation vectors, vector pAg2 can also be used as the base vector.

1. Construction of pAG1

Vector pAg1 (SEQ. ID. NO: 1; see Figure 1) is a derivative of the
CAMBIA vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300 to
which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided in
SEQ. ID. NO: 2. pCambia 3300 also contains a lacZ alpha sequence containing
a polylinker region.

pAg1 was constructed by inserting two new functional DNA fragments
into the polylinker of pCambia 3300: one sequence containing an attB site and
a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by an
SV40 polyA signal sequence, and a second sequence containing DNA from the
hygromycin resistance gene (hygromycin phosphotransferase) conferring
resistance to hygromycin for selection in plants. Although the zeomycin-SV40
polyA signal fusion is not expected to provide the basis for zeomycin selection
in plant cells, it can be activated in mammalian cells by insertion of a functional
promoter element into the attB site by site-specific recombination catalyzed by
the Lambda att integrase. Thus, the inclusion of the attB-zeomycin sequences
allows for evaluation of functionality of plant artificial chromosomes in
mammalian cells by activation of the zeomycin resistance-encoding DNA, and
provides an att site for further insertion of new DNA sequences into plant
artificial chromosomes formed as a result of using pAg1 for plant
transformation. The second functional DNA fragment allows for selection of
plant cells with hygromycin. Thus, pAg1 contains DNA from the bar gene
conferring resistance to phosphinothricin, DNA from the hygromycin resistance
gene, both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left T-DNA
border sequences for use in Agrobacterium-mediated transformation of plant
cells or protoplasts with the DNA located between the border sequences. pAg1
also contains the pBR322 Ori for replication in E. coli. pAg1 was constructed
by ligating HindIII/PstI-digested p3300attBZeo with HindIII/PstI-digested
pBSCaMV35SHyg as follows (see Figure 2).

a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated with
PstI/StuI-digested pLITattBZeo (the nucleotide sequence of pLITattBZeo is
provided in SEQ. ID. NO: 19) to generate p3300attBZeo which contains an attB
site, a promoterless zeomycin resistance-encoding DNA flanked at the 3' end
by an SV40 polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg

A DNA fragment containing DNA encoding hygromycin
phosphotransferase flanked by the CaMV 35S promoter and the CaMV 35S
polyA signal sequence was obtained by PCR amplification of plasmid pCambia
1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 3). The primers
used in the amplification reaction were as follows:

CaMV35SpolyA:
5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 4
CaMV35Spr:
5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 5

The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript II SK+
(Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg.
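
Basic properties of the two amplification primers can be checked with a few lines of Python. The sketch below is offered only as an illustration: the primer sequences are transcribed from the text above, while the helper names clean, reverse_complement and gc_fraction are hypothetical and do not refer to any particular software package.

```python
# Primer sequences transcribed from the text (SEQ. ID. NOs: 4 and 5).
PRIMERS = {
    "CaMV35SpolyA": "CTGAATTAACGCCGAATTAATTCGGGGGATCTG",
    "CaMV35Spr": "CTAGAGCAGCTTGCCAACATGGTGGAGCA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clean(seq: str) -> str:
    """Remove whitespace and upper-case a nucleotide sequence."""
    return "".join(seq.split()).upper()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return clean(seq).translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    s = clean(seq)
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") + s.count("C")) / len(s)


for name, seq in PRIMERS.items():
    print(name, len(clean(seq)), "nt",
          "GC fraction", round(gc_fraction(seq), 2),
          "revcomp", reverse_complement(seq))
```

Such a check is a convenient way to catch transcription errors in a primer before it is ordered or compared against the template sequence.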
c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with HindIII/PstI and
ligated with HindIII/PstI-digested p3300attBZeo. Thus, pAg1 contains the
pCambia 3300 backbone with DNA conferring resistance to phosphinothricin and
hygromycin under the control of separate CaMV 35S promoters, an
attB-promoterless zeomycin resistance-encoding DNA recombination cassette and
unique sites for adding additional markers, e.g., DNA encoding GFP. The attB
site facilitates the addition of new DNA sequences to plant or animal, e.g.,
mammalian, artificial chromosomes, including PACs formed as a result of using
the pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of DNAs
containing a homologous att site.

2. pAG2

The vector pAg2 (SEQ. ID. NO: 6; see Figure 3) is a derivative of vector
pAg1 formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA signal,
to pAg1. pAg2 was constructed as follows (see Figure 4). A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS, or
pGEMEasyNOS (SEQ. ID. NO: 7), containing the NOS promoter in the cloning
vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.), with XbaI/NcoI
and was ligated to an XbaI/NcoI fragment of pCambia 1302 containing DNA
encoding GFP (without the CaMV 35S promoter) to generate p1302NOS (SEQ.
ID. NO: 8) containing GFP-encoding DNA in operable association with the NOS
promoter. Plasmid p1302NOS was digested with SmaI/BsiWI to yield a
fragment containing the NOS promoter and GFP-encoding DNA. The fragment
was ligated with PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2
contains DNA from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs under the
control of a cauliflower mosaic virus 35S promoter, DNA encoding kanamycin
resistance, a GFP gene under the control of a NOS promoter and the
attB-zeomycin resistance-encoding DNA. One of skill in the art will appreciate that
other fragments can be used to generate the pAg1 and pAg2 derivatives and
that other heterologous DNA can be incorporated into pAg1 and pAg2 derivatives
using methods well known in the art.

3. pAgIIa and pAgIIb transformation vectors

Vectors pAgIIa and pAgIIb were constructed by inserting the following
DNA fragments into pAg1: DNA encoding β-glucuronidase, the nopaline
synthase terminator fragment, the nopaline synthase (NOS) promoter fragment,
a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer
sequence (IGS). The construction of pAgIIa and pAgIIb was as follows (see
Figure 5).

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID. NO: 9;
see also GenBank Accession No. Y08422; see also Borysyuk et al. (2000)
Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860) was obtained by
PCR amplification of tobacco genomic DNA. The IGS can be used as a
targeting sequence by virtue of its homology to tobacco rDNA genes; the
sequence is also an amplification promoter sequence in plants. This fragment
was amplified using standard PCR conditions (e.g., as described by Promega
Biotech, Madison, WI, U.S.A.) from tobacco genomic DNA using the primers
shown below:
NTIGS-F1
5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3' (SEQ ID No. 10) and
NTIGS-R1
5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3' (SEQ ID No. 11)
Following amplification, the fragment was cloned into pGEM-T Easy to give
pIGS-1.

A fragment of mouse satellite DNA (Msat1 fragment; GenBank Accession
No. V00846; and SEQ ID No. 12) was amplified via PCR from pSAT-1 using the
following primers:
MSAT-F1
5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3' (SEQ ID No. 13)
and
MSAT-R1
5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3' (SEQ ID No. 14)
This amplification added a SacII and a HindIII site at the 5' end and a SacII
site at the 3' end of the PCR fragment. This fragment was then cloned into the
SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic centromere-specific
DNA and a convenient DNA sequence for detection via FISH.

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter (GenBank
Accession No. U09365; SEQ ID No. 15), E. coli β-glucuronidase coding
sequence (from the GUS gene; GenBank Accession No. S69414; and SEQ ID
No. 16), and the nopaline synthase terminator sequence (GenBank Accession
No. U09365; SEQ ID No. 18). The NOS promoter in pGEM-T-NOS was added
to a promoterless GUS gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.)
using NotI/SpeI to form pNGN-1, which has the NOS promoter in the opposite
orientation relative to the GUS gene.

pMIGS-1 was digested with NotI/SpeI to yield a fragment containing the
mouse major satellite DNA and the tobacco IGS, which was then added to
NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter was then re-oriented
to provide a functional GUS gene, yielding pNGN-3, by digestion and religation
with SpeI. Plasmid pNGN-3 was then digested with HindIII, and the HindIII
fragment containing the β-glucuronidase coding sequence and the rDNA
intergenic spacer, along with the Msat sequence, was added to pAg1 to form
pAgIIa, using the unique HindIII site in pAg1 located near the right T-DNA
border of pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgIIb, was also recovered, which
contained the inserted HindIII fragment in the opposite orientation relative to
that observed in pAgIIa. Thus, pAgIIa and pAgIIb differ only in the orientation
of the HindIII fragment containing the mouse major satellite sequence, the GUS
DNA sequence and the IGS sequence (see Figure 6). The nucleotide sequence
of pAgIIa is provided in SEQ. ID. NO: 21.
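
The restriction sites that the MSAT-F1 and MSAT-R1 primers are stated to add
can be checked by simple string matching. The following is a minimal sketch,
not part of the described procedure: it assumes the standard recognition
sequences for SacII (CCGCGG) and HindIII (AAGCTT), and the helper names
`normalize` and `sites_in` are illustrative only.

```python
# Recognition sequences (standard values, assumed here): both are palindromic,
# so a match on the primer strand implies a site in the double-stranded product.
SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

# Primer sequences as given for the mouse satellite amplification (5' to 3').
PRIMERS = {
    "MSAT-F1": "AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C",
    "MSAT-R1": "ATA ACC GCG GAG TCC TTC AGT GTG CAT",
}


def normalize(seq):
    """Remove the triplet spacing and upper-case the sequence."""
    return "".join(seq.split()).upper()


def sites_in(seq):
    """Return {enzyme: 0-based position} for each site present in seq."""
    s = normalize(seq)
    return {name: s.find(site) for name, site in SITES.items() if site in s}


report = {name: sites_in(seq) for name, seq in PRIMERS.items()}
for name, found in report.items():
    print(name, found)
```

Under these assumptions the forward primer carries both a SacII and a HindIII
site and the reverse primer carries a SacII site only, which agrees with the
description of the amplified fragment.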

Vectors pAg1, pAg2, pAgIIa and pAgIIb, as well as similarly designed
vectors containing a recombination site and a promoter (e.g., plant or animal
promoter), and possibly other regulatory sequences, in operable association with
DNA encoding a protein or other product for expression in a host cell, such
as a plant or animal cell, can be used in the transfer of any protein (or other
product)-encoding nucleic acid of interest into a cell for expression thereof. For
example, any protein (or other product)-encoding nucleic acid of interest (in
operable association with transcriptional regulatory sequences suitable for use
in a particular host cell) can be inserted into any of the vectors pAg1, pAg2,
pAgIIa and pAgIIb and thereby incorporated into a plant, animal or other
artificial chromosome, particularly a platform artificial chromosome ACes, as
described herein.

Example 6

Agrobacterium-Mediated Transformation of Plant Cells

Plant cells were transformed via Agrobacterium-mediated transformation
according to standard procedures (see, for example, Horsch et al. (1988) Plant
Molecular Biology Manual, A5:1-9, Kluwer Academic Publisher, Dordrecht,
Belgium). Briefly, Agrobacterium strain GV 3101/pMP90 (see Koncz and Schell
(1986) Molecular and General Genetics 204:383-396) was transformed with
pAgIIa and pAgIIb (see Example 5) by heat shock, and the plasmid integrity of
pAgIIa and pAgIIb after transformation was verified by HindIII digest pattern.
pAgIIa/pMP90 or pAgIIb/pMP90 were cultured in 5 ml AB minimum medium
(Horsch et al. (1988) Plant Molecular Biology Manual, Kluwer Academic
Publisher, Dordrecht, Belgium) containing 25 µg/ml kanamycin and 25 µg/ml
gentamycin at 28°C for two days.
Leaf disks of tobacco and Arabidopsis and root segments of Arabidopsis
were prepared as follows: tobacco leaves from 3 to 4 week-old explants were
cut into disks 1 cm in diameter, and Arabidopsis leaves were taken from 3
week-old seedlings and cut transversely into two halves. Roots of 3 week-old
Arabidopsis were excised into segments 1 cm in length. Cocultivation was
carried out by immersing leaf disks or root segments in bacterial culture for 2
minutes and then transferring the infected tissues to culture medium without
antibiotics for 2 days at 22°C with 16 hours/day of cool white fluorescent
light. The leaf disks of tobacco and Arabidopsis were cultured on MS104 medium
(MS, 3% sucrose, 0.05% MES, 1.0 mg/l BA, 0.1 mg/l NAA and 0.8% agar, pH 5.8)
and root segments on callus-inducing medium, CIM 0.5/0.05 (B5, 2% glucose,
0.05% MES, 0.5 mg/l 2,4-D, 0.05 mg/l kinetin and 0.8% agar, pH 5.8).

The transformed leaf disks and root segments were then transferred to
selection medium of MS104 or CIM 0.5/0.05, respectively, containing 20 mg/l
hygromycin and 300 mg/l Timentin for the elimination of Agrobacterium. The
selection medium was refreshed every two weeks and green shoots
regenerated. Plants were analyzed for the expression of the DNA encoding GUS
by standard histochemical and fluorescent assays and for evidence of
amplification of the inserted DNA by quantitative PCR. Numerous plants were
obtained that expressed high levels of GUS, and multiple copies of the GUS gene
were observed by Fluorescent In Situ Hybridization (FISH) and PCR analysis.
Thus, amplification of the chromosomal regions containing the inserted DNA was
observed. One of skill in the art will appreciate that GUS expression, or the
expression of any other gene, can be assessed using methods well known in the
art.
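
One commonly used way to turn quantitative PCR data into an estimate of
relative copy number is the comparative Ct (2^-ddCt) calculation. The sketch
below is illustrative only: the quantification method actually used is not
stated here, the function name and all Ct values are hypothetical, and ideal
amplification efficiency (a doubling per cycle) is assumed.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_calibrator, ct_ref_calibrator,
                         efficiency=2.0):
    """Comparative Ct estimate of target copies relative to a calibrator.

    ct_target_* : Ct of the inserted gene (e.g. GUS)
    ct_ref_*    : Ct of an endogenous reference gene
    efficiency  : fold increase per cycle (2.0 = ideal doubling, assumed)
    """
    d_sample = ct_target_sample - ct_ref_sample
    d_calibrator = ct_target_calibrator - ct_ref_calibrator
    dd_ct = d_sample - d_calibrator
    return efficiency ** -dd_ct


# Hypothetical Ct values: the calibrator is a line assumed to carry one copy.
fold = relative_copy_number(
    ct_target_sample=22.0, ct_ref_sample=24.0,
    ct_target_calibrator=25.0, ct_ref_calibrator=24.0,
)
print(f"estimated copies relative to calibrator: {fold:.1f}")
```

With these made-up inputs the target crosses threshold three cycles earlier
than in the calibrator after normalization, which corresponds to roughly an
eight-fold higher copy number under the stated assumptions.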

Example 7

Transfection and Culture of Arabidopsis Protoplasts

E. coli strain Stbl4 (Gibco Life Sciences) was transformed with pAgIIa,
pAgIIb, and one of two targeting plasmids containing the rDNA repeat sequence
from Arabidopsis (plasmid pJHD-14A) or the 26S rDNA from Arabidopsis (plasmid
pJHD2-19A), as described by Doelling et al. [(1993) Proc. Natl. Acad. Sci.
U.S.A. 90:7528-7532], via electroporation according to standard procedures.
A single colony was grown in 250 ml LB medium containing 50 µg/ml
kanamycin (for selection based on the kanamycin resistance-encoding DNA in
pAgIIa and pAgIIb) or 50 µg/ml ampicillin (for selection based on the ampicillin
resistance-encoding DNA in pJHD-14A and pJHD2-19A) and cultured at 30°C
with shaking at 225 rpm for 16 hours. The plasmids were isolated according to
standard procedures well known in the art. The structural integrity of the
plasmids was checked by restriction digestion pattern, and the plasmids were
linearized with restriction enzymes. Plasmids were sterilized with chloroform
and 70% ethanol before use for transfection.

Arabidopsis protoplasts were resuspended in the culture medium (see
Example 1) at a density of 2 x 10^6 protoplasts/ml. A 300 µl protoplast
suspension was pipetted into a 15 ml tube, and 30 µl of plasmid (pAgIIa or
pAgIIb) and targeting DNA (pJHD-14A or pJHD2-19A), containing 10 µg plasmid
and 100 µg targeting sequence, was added, followed immediately by slowly
adding 300 µl of 10% PEG. The targeting plasmids were included in the
transfection procedure in order to ensure that the amount of rDNA targeting DNA
(i.e., tobacco rDNA from pAgIIa or pAgIIb and Arabidopsis DNA from the targeting
vectors) was sufficient to effect recombination of the introduced DNA at a
homologous site in an Arabidopsis chromosome. DNA was typically used in a
ratio of 10:1, targeting DNA (pJHD-14A or pJHD2-19A, or Lambda DNA) to
plasmid DNA (pAgIIa or pAgIIb, or a selectable marker plasmid), or in a ratio of
5:1. Generally, the number of base pairs of targeting DNA sufficient for
insertion into a plant chromosome is at least about 50 bp, or about 60 bp, or
about 70 bp, or about 80 bp, or about 90 bp, or about 100 bp, or about 150
bp, or about 200 bp, or about 300 bp, or about 400 bp, or about 500 bp, or
about 600 bp, or about 700 bp, or about 800 bp, or about 900 bp, or about 1
kb, or about 2 kb, or about 3 kb, or about 4 kb, or about 5 kb, or about 6 kb,
or about 7 kb, or about 8 kb, or about 9 kb, or about 10 kb or more. The
amount and length of targeting DNA sufficient to effect introduction into a
chromosome can be determined empirically and can vary for different plant
species.
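
The quantities in a single transfection follow from the protoplast density,
the suspension volume and the targeting-to-plasmid DNA ratio. The following
is a minimal bookkeeping sketch under stated assumptions: the plasmid amount is
taken as 10 µg per transfection, ratios are by mass, and the function name
`transfection_plan` is hypothetical.

```python
def transfection_plan(density_per_ml, suspension_ul, plasmid_ug, ratio):
    """Compute protoplast number and DNA masses for one transfection.

    density_per_ml : protoplasts per ml of suspension
    suspension_ul  : volume of suspension used, in microlitres
    plasmid_ug     : mass of plasmid DNA (assumed value), in micrograms
    ratio          : targeting DNA to plasmid DNA, by mass (e.g. 10 for 10:1)
    """
    protoplasts = density_per_ml * suspension_ul / 1000.0
    targeting_ug = plasmid_ug * ratio
    return {
        "protoplasts": protoplasts,
        "targeting_ug": targeting_ug,
        "total_dna_ug": plasmid_ug + targeting_ug,
    }


# 300 µl at 2 x 10^6 protoplasts/ml, at the 10:1 and 5:1 ratios.
plan_10 = transfection_plan(2e6, 300, plasmid_ug=10, ratio=10)
plan_5 = transfection_plan(2e6, 300, plasmid_ug=10, ratio=5)
print(plan_10)
print(plan_5)
```

Under these assumptions each tube holds 6 x 10^5 protoplasts, and moving from
a 10:1 to a 5:1 ratio halves the targeting DNA from 100 µg to 50 µg.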

The mixture was shaken gently, and immediately 300 µl of 10% PEG
solution was added slowly with gentle shaking. The protoplast mixture was
incubated at 22°C for 10-15 min with several cycles of gentle shaking. DNA
uptake was quenched by the addition of 5 ml 72.4 g/l Ca(NO3)2. The
protoplasts were then centrifuged at 80 x g for 7 min and resuspended in culture
medium. For selection, 10 to 40 mg/l hygromycin was added to protoplast
cultures 14 days after transfection, and the culture medium was refreshed every
7 days. The protoplast cultures could also be selected after embedding in 0.6%
agarose by transferring to a culture medium containing 20 mg/l hygromycin. The
cultures were incubated for 14 days or longer at 22°C.

The Arabidopsis protoplasts were analyzed for the presence and
expression of the DNA encoding GUS. Recovered microcalli strongly expressed
GUS and were resistant to selective agents, indicating amplification of the
inserted DNA. Alternatively, the transfection of Arabidopsis protoplasts can
be conducted without using targeting DNA sequences since pAgIIa and pAgIIb
include a region of rDNA (i.e., the tobacco rDNA IGS) that can act as a targeting
sequence as long as a sufficient amount of pAgIIa/b plasmid is used in the
transfection procedure.

Example 8

Transfection and Culture of Tobacco Protoplasts

As described in Example 7, E. coli strain Stbl4 was transformed with pAgIIa,
pAgIIb, pJHD-14A (targeting DNA) and pJHD2-19A (targeting DNA) via
electroporation, and plasmid DNA was recovered and linearized with restriction
enzymes. Plasmids were sterilized with chloroform and 70% ethanol before use
for transfection.

The tobacco protoplasts (see Examples 2 and 3) were resuspended in the
culture medium (see Example 2) at a density of 2 x 10^6 protoplasts/ml. A 300
µl protoplast suspension was pipetted into a 15 ml tube, and 30 µl of plasmid
and targeting DNA was added as described in Example 7. The mixture was
shaken gently, and immediately 300 µl of 10% PEG solution was added slowly
with gentle shaking. The tobacco protoplast mixture was incubated at 22°C
for 10-15 min with several cycles of gentle shaking. DNA uptake was
quenched by the addition of 5 ml 72.4 g/l Ca(NO3)2. The protoplasts were then
centrifuged at 80 x g for 7 min and resuspended in culture medium.

The recovery of viable tobacco protoplasts following DNA uptake ranged
from 65% to 75%. Typically greater than 35% of the
protoplasts initiated cell division within 7 days of treatment. Protoplast cells
were analyzed for gene expression (in this case for the expression of the
reporter DNA GUS, but alternatively, the expression of other genes can be
monitored). Between 4% and 6% of the recovered cells exhibited GUS
expression.

The protoplasts were subjected to selection procedures to recover
transformed cells. For selection of tobacco cells, 10 to 40 mg/l hygromycin
was added to protoplast cultures 10-14 days after transfection, and the culture
medium was refreshed every 7 days. Leaf disc selection was performed in the
presence of 40 mg/l hygromycin. Transformed microcalli were recovered and
analyzed for the expression of the GUS reporter gene. GUS-positive calli were
isolated and subjected to FISH analysis (see Example 13). Plant cells that
exhibited amplification of the inserted DNA were identified.

Example 9

Transfection and Culture of Brassica Protoplasts

Brassica protoplasts (see Example 4), following the final washing step
after filtering through a 63 µm nylon screen and centrifugation, are collected
and used for DNA transfection as described in Example 8. Brassica protoplast
cultures following DNA uptake or transformation by Agrobacterium can be
selected with either hygromycin or glufosinate ammonium in liquid culture or in
embedded semi-solid cultures. The effective concentration of hygromycin is 10
to 40 mg/l for 2 to 4 weeks or continuously, whereas that for glufosinate
ammonium is 2 to 60 mg/l for 5 days to 2 weeks. Selection can impede growth,
and additional transfers to similar media may be required.

Example 10

Plant Regeneration from Brassica Protoplasts

Colonies of Brassica protoplasts (1 mm or larger in diameter) are plated
onto regeneration medium (basal Murashige and Skoog's medium, 1% sucrose,
2 mg/l BA, 0.01 mg/l NAA, 0.8% agarose, pH 5.6). Cultures are incubated
under the conditions described in Example 4. Cultures are transferred onto
fresh regeneration medium every 2 weeks. Regenerated shoots are transferred
onto autoclaved rooting medium (basal Murashige and Skoog's medium, 1%
sucrose, 0.1 mg/l NAA, 0.8% agar, pH 5.8) and incubated under dim
fluorescent light (25 µE m^-2 s^-1). Plantlets are potted in a soil-less mix (for
example, Terra-lite Redi-Earth, W.R. Grace & Co., Canada Ltd., Ajax, Ontario)
containing fertilizer (Nutricote 14-14-14 type 100, Plant Products Co. Ltd,
Brampton, Ontario) and grown in a growth room (20°C/15°C, 16 h
photoperiod, 100-140 µE m^-2 s^-1) with fluorescent and incandescent light at
soil level. Plantlets are covered with transparent plastic cups for one week to
allow for acclimatization.

Example 11

Isolation of Nuclei from Protoplasts

To facilitate analysis, plant cells can be subjected to nuclei isolation, and
the isolated nuclei can be analyzed by FISH or PCR. To isolate the nuclei,
protoplast calli were reprotoplasted according to the procedure of Mathur et al.
with modifications (see Mathur et al. Plant Cell Reports (1995) 14:221-226).
The protoplast calli were digested with 1.2% Cellulase 'Onozuka' R-10 and
0.4% w/v Macerozyme R-10 in nuclei isolation buffer (10 mM MES, pH 5.5,
0.2 M sucrose, 2.5 mM EDTA, 2.5 mM DTT, 0.1 mM spermine, 10 mM NaCl,
10 mM KCl and 0.15% Triton X-100) for 3 hours. After centrifugation at 80
x g for 10 minutes, the pellets of protoplasts were resuspended in hypertonic
buffer of 12.5% W5 solution (Hinnisdaels et al. (1994) Plant Molecular Biology
Manual G2:1-13, Kluwer Academic Publisher, Belgium) for 10 minutes. To
promote disruption of protoplasts, the protoplast suspension was forced through
a syringe needle four times. The disrupted protoplasts were filtered through 5
µm meshes to remove debris and centrifuged at 200 x g for 10 min. By
repeated washing of the pellet in a nuclei isolation buffer containing
phenylmethylsulfonylfluoride (PMSF) and centrifugation at 200 x g for 10
minutes, nuclei were collected as a white pellet freed from cytoplasmic
contamination and cellular debris. Samples were fixed in 3:1 methanol:glacial
acetic acid and were analyzed by FISH.

Example 12

Mitotic Arrest of Plant Cells for Detection of Amplification and
Artificial Chromosome Formation

Plant cells or protoplasts are typically cultured for two or more
generations prior to mitotic arrest. Typically, 5 µg/ml colchicine is added to the
cultures for 12 hours to accumulate mitotic plant cells. The mitotic cells are
harvested by gentle centrifugation. Alternatively, plant cells (grown on plastic
or in suspension) can be arrested in different stages of the cell cycle with
chemical agents other than colchicine, such as, but not limited to, hydroxyurea,
vinblastine, colcemid or aphidicolin, or through the deprivation of nutrients,
hormones, or growth factors. Chemical agents that arrest the cells in stages
other than mitosis, such as, but not limited to, hydroxyurea and aphidicolin, are
used to synchronize the cycles of all cells in the population and are then
removed from the cell medium to allow the cells to proceed, more or less
simultaneously, to mitosis, at which time they can be harvested to disperse the
chromosomes.

Example 13

Detection of Amplification and Artificial Chromosome Formation by
Fluorescence In Situ Hybridization (FISH)

A variety of plant cells can be analyzed by fluorescence in situ hybridization
(FISH) methods (Fransz et al. (1996) Plant J. 9:421-430; Fransz et al. (1998)
Plant J. 13:867-876; Wilkes et al. (1995) Chromosome Research 3:466-472;
Busch et al. (1994) Chromosome Research 2:15-20; Nkongolo (1993) Genome
35:701-705; Leitch et al. (1994) Methods in Molecular Biology 23:177-185;
Murata et al. (1997) Plant J. 12:31-37) to identify amplification events and
artificial chromosome formation.

FISH is used to detect specific DNA sequences on chromosomes, in
particular to detect regions of plant chromosomes that have undergone
amplification as a result of the introduction of heterologous DNA as described
herein, or to detect artificial chromosome formation in plant cells. FISH
chromosome spreads of Arabidopsis and tobacco plant cells into which
heterologous DNA has been introduced are generated using colchicine or similar
cell cycle arresting agents and various DNA probes (e.g., rDNA probe, Lambda
DNA probe, selectable marker probe). The cells are analyzed for the presence
of amplified regions of chromosomes, in particular amplification of the rDNA
regions, and those cells exhibiting amplification are further cultured and
analyzed for the formation of artificial chromosomes.

The chromosomes of plant cells subjected to introduction of heterologous
DNA and growth to generate artificial chromosomes can also be analyzed by
scanning electron microscopy. Preparation of mitotic chromosomes for
scanning electron microscopy can be performed using methods known in the
art (see, e.g., Sumner (1991) Chromosoma 100:410-418). The chromosomes
can be observed, for example, with a Hitachi S-800 field emission scanning
electron microscope operated with an accelerating voltage of 25 kV.

Example 14

Detection of Amplification and Artificial Chromosome Formation by
IdU Labeling of Chromosomes

The structure of the chromosomes in plant cells can be analyzed by labeling
the chromosomes with iododeoxyuridine (IdU), or other nucleotide analog, and
using an IdU-specific antibody to visualize the chromosome structure. Plant cell
cultures selected following introduction of heterologous DNA are labeled with
IdU following standard protocols (Fujishige and Taniguchi (1998) Chromosome
Research 6:611-619; Yanpaisan et al. (1998) Biotechnology and Bioengineering,
58:515-528; Trick and Bates (1996) Plant Cell Reports, 15:986-990; Binarova
et al. (1993) Theoretical and Applied Genetics, 87:9-16; Wang et al. (1991)
Journal of Plant Physiology, 755:200-203). Plant cells in culture, typically
suspension culture, are used. A series of sub-cultures are initiated, and IdU
labeling is performed according to the protocols cited above. Cells are allowed
to incorporate IdU for up to a week, depending on the doubling time of the
culture. Labeled chromosomes can be detected in plant cells (Fujishige and
Taniguchi (1998) Chromosome Research 6:611-619; Binarova et al. (1993)
Theoretical and Applied Genetics 87:9-16) and in mammalian cells (Gratzner and
Leif (1981) Cytometry 1:385-393) using procedures well known in the art.
IdU-labeled chromosomes are detected by immunocytochemical techniques. An
anti-IdU fluorescein isothiocyanate (FITC)-conjugated B44 clone antibody (Becton
Dickinson) is used to bind the IdU-DNA adduct in the DNA and is detected by
fluorescence microscopy (490 nm excitation, 519 nm emission). Analysis of
labeled chromosomes reveals the presence of amplified DNA regions and the
formation of artificial chromosomes.

Example 15

Isolation of Metaphase Chromosomes from Protoplasts

Artificial chromosomes, once detected in plant cells, may be isolated for
transfer to other organisms and in particular other plant species. Several
procedures may be used to isolate metaphase chromosomes from mitotic-arrested
plant cells, including, but not limited to, a polyamine-based buffer
system (Cram et al. (1990) Methods in Cell Biology 33:377-382), a modified
hexylene glycol buffer system (Hadlaczky et al. (1982) Chromosoma
86:643-659), a magnesium sulfate buffer system (Van den Engh et al. (1988)
Cytometry 9:266-270 and Van den Engh et al. (1984) Cytometry 5:108), an
acetic acid fixation buffer system (Stoehr et al. (1982) Histochemistry
74:57-61), and a technique utilizing hypotonic KCl and propidium iodide (Cram
et al. (1994) XVII meeting of the International Society for Analytical Cytology,
October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial
Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:373; de Jong
et al. (1999) Cytometry 35:129-133).

In an exemplary procedure, a hexylene glycol buffer is used to isolate plant
chromosomes from mitotic-arrested plant cells that have been converted to
protoplasts (Hadlaczky et al. (1982) Chromosoma 86:643-659). Chromosomes
are isolated from about 10^6 mitotic cells re-suspended in a glycine-hexylene
glycol buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6, adjusted with
a solution of saturated Ca(OH)2) supplemented with 0.1% Triton X-100 (GHT
buffer). The cells are incubated for 10 minutes at 37°C, and the chromosomes
are purified by differential centrifugation to pellet the nuclei (200 x g for 20
min) and sucrose gradient centrifugation (5-30% sucrose, 5600 x g for 60 min,
0-4°C). To avoid proteolytic degradation of chromosomal proteins, 1 mM PMSF
(phenylmethylsulfonylfluoride) is used in the presence of 1% isopropyl alcohol.
The proteins can be extracted from the isolated chromosomes using dextran
sulfate-heparin (DSH) extraction, and the chromosomes can be visualized via
electron microscopy using techniques known in the art (Hadlaczky et al. (1982)
Chromosoma (Berl.) 86:643-659; Hadlaczky et al. (1981) Chromosoma (Berl.)
81:537-555). Additionally, modifications of these procedures, including, but
not limited to, modification of the buffer composition (Carrano et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384) and variation of the centrifugation
time or speed, to accommodate different plant species can be implemented by
any skilled artisan.
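
Because the centrifugation steps are specified as relative centrifugal force
(x g), the rotor speed depends on the rotor used. The sketch below applies the
standard relation RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm); the 10 cm rotor
radius is a hypothetical value chosen for illustration and is not specified
in this Example.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ASSUMED_RADIUS_CM = 10.0  # hypothetical rotor radius, not from the source

# The two forces named in the chromosome isolation procedure.
for rcf in (200, 5600):
    rpm = rpm_from_rcf(rcf, ASSUMED_RADIUS_CM)
    print(f"{rcf} x g at r = {ASSUMED_RADIUS_CM} cm: about {rpm:.0f} rpm")
```

For a different rotor, only `ASSUMED_RADIUS_CM` needs to change; the force
values themselves come from the procedure.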

Example 16

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Dicot Plant: Arabidopsis

One method of delivery of mammalian artificial chromosomes (MACs) into
plant cells is the formation of microcells containing murine MACs and the
CaPO4-mediated uptake or the PEG-mediated fusion of these microcells with
plant protoplasts. In this example, microcells and plant protoplasts, such as but
not limited to tobacco and Arabidopsis protoplasts, were mixed (in a series of
25:1, 10:1, 5:1, or 2:1 microcell:protoplast ratios) and fusion was observed.
Protocols for the formation of microcells are known in the art and are described,
for example, in U.S. Patent Nos. 5,240,840, 4,806,476 and 5,298,429 and in
Fournier Proc. Natl. Acad. Sci. U.S.A. (1981) 78:6349-6353 and Lambert et al.
Proc. Natl. Acad. Sci. U.S.A. (1991) 88:5907-5912. The murine microcells
can be labeled with IdU or the MACs stained with a specific dye such as, but
not limited to, propidium iodide or DAPI, prior to fusion with plant
protoplasts including, but not limited to, Arabidopsis and tobacco protoplasts,
to facilitate detection of the presence of MACs in the protoplasts.

In this example, MACs were introduced into Arabidopsis cells using
microcell-PEG mediated fusion. Microcells were formed from murine cells
containing an artificial chromosome (see U.S. Patent No. 6,077,697) and were
fused with freshly prepared Arabidopsis protoplasts in a ratio of 10:1,
microcells to protoplasts. Fusion occurred in the presence of 25% PEG 6000,
204 mM CaCl2, pH 6.9 within the first 5 minutes of mixing. Typically less than
about one minute of mixing is required to observe fusion between microcells
and protoplasts. Fused cells were washed with 240 mM CaCl2, then floated on
top of a solution of 204 mM sucrose in B5 salts. Cells were then transferred to
cell suspension culture media (MS, 87 mM sucrose, 2.7 µM naphthalene acetic
acid, 0.23 µM kinetin, pH 5.8). Empirical observations can be used to
determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.
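
The sucrose concentrations in this Example are given in millimolar, whereas
media are usually weighed out in grams per litre. The following is a small
conversion sketch, assuming a molar mass for sucrose (C12H22O11) of 342.30
g/mol; the function name is illustrative.

```python
SUCROSE_G_PER_MOL = 342.30  # molar mass of sucrose, C12H22O11


def mm_to_g_per_l(concentration_mm, molar_mass_g_per_mol):
    """Convert a millimolar concentration to grams per litre."""
    return concentration_mm / 1000.0 * molar_mass_g_per_mol


flotation_g_per_l = mm_to_g_per_l(204, SUCROSE_G_PER_MOL)  # sucrose in B5 salts
culture_g_per_l = mm_to_g_per_l(87, SUCROSE_G_PER_MOL)     # suspension medium

print(f"204 mM sucrose = {flotation_g_per_l:.2f} g/l")
print(f"87 mM sucrose = {culture_g_per_l:.2f} g/l")
```

The same function can be applied to the calcium chloride concentrations, but
the result depends on whether an anhydrous or hydrated salt is weighed out,
which is not stated here.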

Fused protoplasts were allowed to grow for one or more generations.
The presence of a mouse chromosomal sequence, including MACs, was
demonstrated by Southern hybridization with MAC probes, by FISH analysis and
by PCR analysis using, for example, satellite sequences known to exist on the
MAC. Thus, the mouse sequences were detected in the
Arabidopsis protoplasts.

To further demonstrate the transfer of mouse chromosomal sequence to
Arabidopsis protoplasts, Arabidopsis plant cell nuclei were isolated according
to Example 11 and were subjected to FISH analysis according to Example 13,
using the mouse major satellite DNA (SEQ ID No. 12). A portion of the nuclei
contained a significant signal using the mouse major satellite DNA, indicating
successful transfer of at least a mouse chromosome and/or MAC to the
Arabidopsis nuclei.

Similarly, PACs may be introduced into Arabidopsis protoplasts using
PEG- and/or calcium-mediated fusion procedures. Generation of
microprotoplasts and protoplasts can be conducted as described, for example,
in Example 1. Microprotoplasts formed from plant cells containing a plant
artificial chromosome are fused with freshly prepared Arabidopsis protoplasts,
for example, in a ratio of 10:1, microprotoplasts to protoplasts. Protoplasts
from other plants, including but not limited to, tobacco, wheat, maize and rice,
can also be used as the recipient of MACs and/or PACs. Fused protoplasts are
recovered and allowed to grow for one or more generations. The presence of
the transferred PACs can be analyzed using methods such as, for example,
those described herein (including Southern hybridization with PAC probes, FISH
analysis and PCR analysis using DNA sequences specific to the PAC).



Example 17

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Second Dicot Plant: Tobacco

MACs were introduced into tobacco cells using microcell-PEG mediated
fusion using the same microcells, MAC, and protocol as described in Example
16. Microcells were formed from murine cells containing an artificial
chromosome and were fused with freshly prepared tobacco BY-2 protoplasts in
a ratio of 10:1, microcells to protoplasts. Fusion occurred in the presence of
20% PEG 4000 and 100-200 mM calcium chloride. Empirical observations are
used to determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.

DAPI staining of the microcells (e.g., by preincubation of the microcells
with DAPI at a final concentration of 1 µg/ml)
allowed visualization of the fusion and transfer of the chromosomes to the
tobacco protoplasts. Fused protoplasts were recovered and allowed to grow for
one or more generations. The fused protoplasts can be analyzed for the
presence of a MAC in a number of ways, including those described herein.
Nuclei were isolated, according to Example 11, from tobacco protoplasts that
had been fused with microcells and were subjected to FISH
analysis according to Example 13, using the mouse major satellite DNA (SEQ
ID No. 12). Numerous nuclei were found to have incorporated a mouse
chromosome.

Example 18

Transfer of Isolated Artificial Chromosomes by Lipid-Mediated Transfer
into a Monocot Plant: Rice

Isolated murine artificial chromosomes (MACs) prepared by sorting
through a FACS apparatus (de Jong et al. Cytometry (1999) 35:129-133) were
transferred into rice plant protoplasts by cationic lipid-mediated transfection of
the purified MAC. Purified MACs (see Example 15 and U.S. Patent No.
6,077,697) were mixed with LipofectAMINE 2000 (Gibco, Md, USA) as follows.
Typically, 15 µl of LipofectAMINE 2000 were added to 1 x 10^6 artificial
chromosomes in liquid buffer, the solution was allowed to complex for up to
three hours, and then the solution was added to 1 x 10^5 freshly prepared rice
protoplasts prepared using standard protoplast methods well known in the art.
The uptake of the lipid-complexed artificial chromosome was monitored by
adding to the mixture of protoplasts and purified artificial chromosomes a
fluorescent dye that stains DNA. Microscopic examination of the
protoplast/artificial chromosome mixture over the next several hours allowed the
visualization of the artificial chromosome being transported across the
protoplast cellular membrane and the presence of the readily identifiable MAC
in the cytoplasm of the rice plant cell.

The same procedure as described in this Example for cationic lipid-mediated
transfer of an isolated MAC into rice protoplasts can be used to
transfer isolated MACs, as well as PACs, into rice and other plant protoplasts,
including but not limited to, tobacco, wheat, maize and Arabidopsis. Fused
protoplasts are recovered and allowed to grow for one or more generations.
The presence of the transferred MACs and PACs can be analyzed using
methods such as, for example, those described herein (including, but not limited
to, Southern hybridization with PAC probes, FISH analysis and PCR analysis
using DNA sequences specific to the PAC).

Example 19 

Delivery of Plant Regulatory and Coding Sequences via a Promoterless attBZeo 
Marker Gene in pAg2 onto a MAC Platform 

As described in Examples 6-15, the plasmid pAg2, comprising plant
regulatory and selectable marker genes (SEQ ID NO: 6; prepared as set forth in
Example 5), can be used for the production of a MAC containing said plant
expressible genes. In this example, pAg2, by virtue of the attBZeo DNA
sequences contained on the plasmid, is used for the loading of plant regulatory
and selectable marker genes onto MACs in mammalian cells using the attB
sequences to recombine with attP sequences present on a platform MAC. In
this example, platform MACs are produced with attP sequences and the plasmid
pAg2 is then loaded onto the platform MAC. New MACs so produced are
useful for introduction into plant cells by virtue of the plant expressible markers
contained therein.

A. Construction of Platform MAC containing pSV40attPsensePUR (Figure
7; SEQ ID NO: 26).

An example of a selectable marker system for the creation of a MAC-based
platform into which the plasmid pAg2 can target plant regulatory and
coding sequences is shown in Figure 7. This system includes a vector
containing the SV40 early promoter immediately followed by (1) a 282 base pair
(bp) sequence containing the bacteriophage lambda attP site and (2) the
puromycin resistance marker. Initially a PvuII/StuI fragment containing the
SV40 early promoter from plasmid pPUR (Clontech Laboratories, Inc., Palo Alto,
CA; SEQ ID No. 22) was subcloned into the EcoRI/CRI site of pNEB193 (a
pUC19 derivative obtained from New England Biolabs, Beverly, MA; SEQ ID No.
23), generating the plasmid pSV40193.

The attP site was PCR amplified from the lambda genome (GenBank
Accession No. NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG (SEQ ID No. 24)
attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG (SEQ ID No. 25)
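
As a quick sanity check on the two primers listed above, the following Python sketch reports their length and GC fraction and provides a reverse-complement helper. The primer sequences are copied from the text; the helper names are illustrative, and no claim is made here about annealing temperatures or about where the primers bind in the lambda genome.

```python
# Primer sequences as listed in the text (5' to 3').
ATTP_PRIMERS = {
    "attPUP": "CCTTGCGCTAATGCTCTGTTACAGG",    # SEQ ID No. 24
    "attPDWN": "CAGAGGCAGGGAGTGGGACAAAATTG",  # SEQ ID No. 25
}

# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in ATTP_PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 3), reverse_complement(seq))
```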

After amplification and purification of the resulting fragment, the attP site
was cloned into the SmaI site of pSV40193 and the orientation of the attP site
was determined by DNA sequence analysis (plasmid pSV40193attP). The gene
encoding puromycin resistance (Puro) was isolated by digesting the plasmid
pPUR (Clontech Laboratories, Inc., Palo Alto, CA) with AgeI/BamHI followed by
filling in the overhangs with Klenow, and was subsequently cloned into the AscI
site downstream of the attP site of pSV40193attP, generating the plasmid
pSV40193attPsensePUR (Figure 7; SEQ ID NO: 26).

The plasmid pSV40193attPsensePUR was digested with ScaI and co-transfected
with the plasmid pFK161 into mouse LMtk- cells, and platform
artificial chromosomes were identified and isolated as described herein. Briefly,
puromycin resistant colonies were isolated and subsequently tested for artificial
chromosome formation via fluorescent in situ hybridization (FISH) (using mouse
major and minor DNA repeat sequences, the puromycin gene and telomere
sequences as probes), and then sorted by fluorescence-activated cell sorting
(FACS). From this sort, a subclone was isolated containing an artificial
chromosome, designated B19-38. FISH analysis of the B19-38 subclone
demonstrated the presence of telomeres and mouse minor repeat sequences on the
MAC. DOT PCR has been done, revealing the absence of uncharacterized
euchromatic regions on the MAC. The process for generating this exemplary MAC
platform containing multiple site-specific recombination sites is summarized in
Figure 5. This MAC chromosome may subsequently be engineered to contain target
gene expression nucleic acids using the lambda integrase mediated site-specific
recombination system as described below.

B. Construction of Targeting Vector.

The construction of the targeting vector pAg2 is set forth in Example 5
herein.

C. Transfection of Promoterless Marker and Selection With Drug (See
Figure 9).

The mouse LMtk- cell line containing the MAC B19-38 (constructed as
set forth above and also referred to as a 2nd generation platform ACE) is plated
onto four 10 cm dishes at approximately 5 million cells per dish. The cells are
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5% CO2.
The following day the cells are transfected with 5 µg of the vector pAg2
(prepared as described in Example 5 above) and 5 µg of pCXLamIntR (encoding
a lambda integrase having an E to R amino acid substitution at position 174),
for a total of 10 µg per 10 cm dish. Lipofectamine Plus reagent is used to
transfect the cells according to the manufacturer's protocol. Two days
post-transfection, zeocin is added to the medium at 500 µg/ml. The cells are
maintained in selective medium until colonies are formed. The colonies are then
ring-cloned and genomic DNA is analyzed.

D. Analysis of Clones (PCR, Sequencing).

Genomic DNA (including MACs) is isolated from each of the candidate
clones with the Wizard kit (Promega), following the manufacturer's protocol.
The following primer set is used to analyze the genomic DNA isolated from the
zeocin resistant clones: 5PacSV40 - CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG
(SEQ ID NO: 28); Antisense Zeo - TGAACAGGGTCACGTCGTCC (SEQ ID NO: 29).
PCR amplification using the above primers and genomic DNA, which included MACs,
from the candidate clones results in a PCR product indicating the correct
sequence for the desired site-specific integration event.
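
The name of the forward primer, 5PacSV40, suggests that it carries a PacI recognition site (TTAATTAA) ahead of SV40-derived sequence. That reading is an assumption and is not stated in the text. The following Python sketch simply scans both screening primers for that motif; the function name is illustrative.

```python
# Screening primers as listed in the text (5' to 3').
SCREEN_PRIMERS = {
    "5PacSV40": "CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG",  # SEQ ID NO: 28
    "Antisense Zeo": "TGAACAGGGTCACGTCGTCC",              # SEQ ID NO: 29
}

# PacI recognition sequence; its presence in the primer is what is checked,
# and its intended use there is an assumption.
PACI_SITE = "TTAATTAA"


def find_site(seq: str, site: str) -> list:
    """Return the 0-based start positions of every occurrence of site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


if __name__ == "__main__":
    for name, seq in SCREEN_PRIMERS.items():
        print(name, len(seq), find_site(seq, PACI_SITE))
```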

The MACs containing the pAg2 vector are identified and used for transfer
into plant (such as described in Examples 16 and 17) or animal cells for the
expression of the desired coding sequences contained therein. The MACs
containing pAg2 carry two plant selectable markers (hygromycin resistance,
resistance to phosphinothricin) and a visual selectable marker (green fluorescent
protein).

Example 20 

Construction of Plant-derived Shuttle Artificial Chromosome.

In another embodiment, the plant artificial chromosomes provided herein 
are useful as selectable shuttle vectors that are able to move one or more 
desired genes back and forth between plant and mammalian cells. In this 
particular embodiment, the plant artificial chromosome is bi-functional in that 
proper integration of donor nucleic acid can be selected for in both plant and
mammalian cells. 

For example, a plant artificial chromosome is prepared as described in
Examples 6-15 above using the plasmid pAg2 (Example 5; SEQ ID NO: 6)
that has been modified to include the SV40attPsensePur coding region from the
plasmid pSV40193attPsensePur (described above in Example 19.A.). Thus, the
resulting plant-derived shuttle artificial chromosome contains DNA from the bar
gene conferring resistance to phosphinothricin in plant cells, DNA from the
hygromycin resistance gene conferring resistance to hygromycin in plant cells,
both resistance-encoding DNAs under the control of a separate cauliflower
mosaic virus (CaMV) 35S promoter, the attB-promoterless zeomycin
resistance-encoding DNA, and DNA conferring resistance to puromycin under the
control of a mammalian SV40 promoter. Accordingly, the presence of the shuttle
PAC in either a plant or mammalian cell can be selected for by treatment with,
for example, either hygromycin (plant) or puromycin (mammalian).
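
The marker complement of the shuttle PAC described in this paragraph can be summarized as a small lookup table. The following Python sketch is an illustration only: the class and function names are hypothetical, and the promoterless attB-zeomycin marker is treated as unselectable as carried because the text describes it as having no promoter of its own.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Marker:
    agent: str                # selection agent the marker confers resistance to
    promoter: Optional[str]   # promoter driving the marker, if any
    host: Optional[str]       # host in which the marker is selectable as carried


# Markers listed in the text for the plant-derived shuttle artificial chromosome.
SHUTTLE_PAC_MARKERS = [
    Marker("phosphinothricin", "CaMV 35S", "plant"),
    Marker("hygromycin", "CaMV 35S", "plant"),
    Marker("zeomycin", None, None),  # attB-promoterless; assumed silent as carried
    Marker("puromycin", "SV40", "mammalian"),
]


def selection_agents(host: str) -> List[str]:
    """List the agents usable to select for the shuttle PAC in the given host."""
    return [m.agent for m in SHUTTLE_PAC_MARKERS if m.host == host]


if __name__ == "__main__":
    print(selection_agents("plant"))
    print(selection_agents("mammalian"))
```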

Because the resulting plant-derived shuttle artificial chromosome contains
at least one SV40attP site therein similar to the platform MAC prepared in
Example 19.A. above, a donor vector containing an attB-selectable marker
sequence, such as a plasmid comprising an attBzeo (e.g., pAg2), can be used to
selectively introduce desired heterologous nucleic acids from any species (such
as plants, animals, insects and the like) into the shuttle artificial chromosome
that is present in a mammalian cell.

Likewise, a plant promoter region, such as CaMV35S, can be used to
replace the SV40 promoter in the SV40attPPur region of the modified pAg2
plasmid described above. In this embodiment, because the resulting
plant-derived shuttle artificial chromosome contains at least one CaMV35SattP
site therein analogous to the platform MAC prepared in Example 19.A. above, a
donor vector containing an attB-selectable marker sequence, such as a plasmid
having attBkanamycin, or other plant selectable or scorable marker, can be used
to selectively introduce desired heterologous nucleic acids from any species
(such as plants, animals, insects and the like) into the shuttle artificial
chromosome that is present in a plant cell.

Since modifications will be apparent to those of skill in this art, it is 
intended that this invention be limited by only the scope of the appended 
claims. 
What is Claimed:

1. A method for producing an artificial chromosome, comprising: 
introducing nucleic acid into a cell comprising one or more plant 

chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions
wherein:

one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

2. The method of claim 1, wherein the artificial chromosome is 
predominantly made up of one or more repeat regions. 

3. The method of claim 1, wherein the nucleic acid introduced into

the cell comprises a nucleic acid sequence that facilitates amplification of a 
region of a plant chromosome or targets it to an amplifiable region of a plant 
chromosome. 

4. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises one or more nucleic acids selected from the group consisting
of rDNA, lambda phage DNA and satellite DNA.

5. The method of claim 4, wherein the nucleic acid comprises plant 

rDNA. 

6. The method of claim 5, wherein the rDNA is from a plant selected
from the group consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon,
Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza.

7. The method of claim 4, wherein the nucleic acid comprises animal 

rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA. 
9. The method of claim 4, wherein the nucleic acid comprises rDNA
comprising sequence of an intergenic spacer region. 

10. The method of claim 9, wherein the intergenic spacer region is 
from DNA from a plant selected from the group consisting of Arabidopsis, 

Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean. 

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of cells 
containing the nucleic acid. 

12. The method of claim 11, wherein the nucleic acid sequence

encodes a fluorescent protein. 

13. The method of claim 12, wherein the protein is a green fluorescent
protein. 

14. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises sorting of cells into which

nucleic acid was introduced. 

15. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises fluorescent in situ hybridization 
(FISH) analysis of cells into which nucleic acid was introduced. 

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus cells.

17. The method of claim 16, wherein the cell is a plant protoplast.

18. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin,
hygromycin, dihydrofolate or sulfonylurea.

20. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions. 

22. A plant cell comprising an artificial chromosome, wherein the 
artificial chromosome is produced by the method of claim 1 or claim 2. 

23. A method of producing a transgenic plant, comprising introducing

the artificial chromosome of claim 20 or claim 21 into a plant cell. 

24. The method of claim 23, wherein the artificial chromosome
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product selected from the group consisting of enzymes, antisense

RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors, 
ribozymes, therapeutic proteins and biopharmaceutical proteins. 

26. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product selected from the group consisting of vaccines, blood 

factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for an agronomically important trait in the

plant. 

29. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that alters the nutrient utilization and/or improves the 
nutrient quality of the plant. 
30. The method of claim 24, wherein the heterologous nucleic acid is
contained within a bacterial artificial chromosome (BAC) or a yeast artificial 
chromosome (YAC). 

31. A method of identifying plant genes encoding particular traits, 
comprising:

generating an artificial chromosome comprising euchromatic DNA
from a first species of plant;

introducing the artificial chromosome into a plant cell of a second
species of plant; and
detecting phenotypic changes in the plant cell comprising the

artificial chromosome and/or a plant generated from the plant cell comprising 
the artificial chromosome. 

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is

produced by a method comprising: 

introducing nucleic acid into a cell comprising one or more plant 
chromosomes; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions, wherein:

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 
34. The method of claim 31, wherein the artificial chromosome is

produced by a method comprising: 

introducing nucleic acid into a plant cell; and 
selecting a plant cell comprising a SATAC. 
35. The method of claim 31, wherein the artificial chromosome is a 
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a
neocentromere and euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid 
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin, 
hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a method

comprising: 

introducing into a plant cell of a first plant species an artificial 
chromosome capable of undergoing homologous recombination with the DNA 
of the first plant species; 
selecting for a recombination event between the artificial chromosome

and the DNA of the first plant species; and 

selecting an artificial chromosome comprising euchromatic DNA from the 
first plant species. 

39. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a method

comprising: 

introducing into a plant cell of a first species an artificial chromosome 
capable of undergoing site-specific recombination with the DNA of the first plant 
species; 

selecting for a site-specific recombination event between the artificial

chromosome and the DNA of the first plant species, and 

selecting an artificial chromosome comprising euchromatic DNA from the 
first plant species. 

40. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence. 

42. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence and 

the artificial chromosome comprises a site-specific recombination sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence and 
the artificial chromosome comprises a site-specific recombination sequence that 
is complementary to the site-specific recombination sequence of the plant cell 

of a first plant species.

44. The method of claim 39, wherein the site-specific recombination 
is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing a first nucleic acid comprising a site-specific

recombination site into a first chromosome of a plant cell; 

introducing a second nucleic acid comprising a site-specific 
recombination site into a second chromosome of the plant cell; 

introducing a recombinase activity into the plant cell, wherein the 
activity catalyzes recombination between the first and second chromosomes
and whereby an acrocentric plant chromosome is produced. 

46. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is 
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome and the 
second nucleic acid is introduced into the distal end of the arm of the second 
chromosome. 
49. A method for producing an acrocentric plant chromosome,
comprising: 

introducing a first nucleic acid comprising a site-specific 
recombination site into the pericentric heterochromatin of a chromosome in a 
plant cell;

introducing a second nucleic acid comprising a site-specific 
recombination site into the distal end of the chromosome, wherein the first and 
second recombination sites are located on the same arm of the chromosome; 

introducing a recombinase activity into the cell, wherein the 
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is 
produced. 

50. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising a recombination site adjacent

to nucleic acid encoding a selectable marker into a first plant cell; 

generating a first transgenic plant from the first plant cell; 
introducing nucleic acid comprising a promoter functional in a plant 
cell, a recombination site and a recombinase coding region in operative linkage 
into a second plant cell;

generating a second transgenic plant from the second plant cell; 
crossing the first and second plants; 

obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker; and 
selecting a resistant plant that contains cells comprising an

acrocentric plant chromosome. 

51. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 5% euchromatic DNA.

52. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA. 

54. The method of any of claims 45-49, wherein the nucleic acid 
introduced into a chromosome comprises nucleic acid encoding a selectable 

marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant acrocentric chromosome in a 

cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA; 

culturing the cell through at least one cell division; and 
selecting a cell comprising an artificial chromosome, is 
predominantly heterochromatic. 
57. The method of claim 56, wherein the acrocentric chromosome is

produced by the method of any of claims 45-49. 

58. A method for producing an artificial chromosome, comprising: 
introducing nucleic acid into a plant cell; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions
wherein: 

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the common nucleic acid sequences comprise sequences that

represent euchromatic and heterochromatic nucleic acid. 

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant. 

62. The method of claim 9, wherein the rDNA is plant rDNA. 

63. The method of claim 62, wherein the plant is a dicot plant species. 

64. The method of claim 62, wherein the plant is a monocot plant

species. 

65. The method of claim 1, wherein the cell is a dicot plant cell. 

66. The method of claim 1, wherein the cell is a monocot plant cell. 

67. An isolated plant artificial chromosome comprising one or more 
repeat regions, wherein:

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is 
produced by a method comprising: 

introducing nucleic acid into a plant cell; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions, wherein:

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the common nucleic acid sequences comprise sequences that represent 
euchromatic and heterochromatic nucleic acid. 

69. The method of claim 44, wherein the recombinase is selected from
the group consisting of a bacteriophage P1 Cre recombinase, a yeast R
recombinase and a yeast FLP recombinase. 

70. The method of claim 50, further comprising selecting first and 
second transgenic plants wherein: 
one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region 
adjacent to the pericentric heterochromatin; and 

the other plant comprises a chromosome comprising a 
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the 
two chromosomes are in the same orientation. 

72. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising two site-specific recombination

sites into a cell comprising one or more plant chromosomes; 

introducing a recombinase activity into the cell, wherein the 

activity catalyzes recombination between the two recombination sites, whereby 

a plant acrocentric chromosome is produced. 

73. The method of claim 72, wherein the two site-specific

recombination sites are contained on separate nucleic acid fragments. 

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is 
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant chromosome in a cell, wherein 

the chromosome contains adjacent regions of rDNA and heterochromatic DNA; 
culturing the cell through at least one cell division; and 
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant 
chromosome into which the nucleic acid is introduced is an acrocentric 

chromosome.

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA. 

80. The method of any of claims 76-79, wherein the heterochromatic 
DNA is pericentric heterochromatin. 

81. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth of 
animal cells in the presence of an agent normally toxic to the animal cells; and 
wherein the agent is not toxic to plant cells; 
a recognition site for recombination; and

a sequence of nucleotides that facilitates amplification of a region 
of a plant chromosome or targets the vector to an amplifiable region of a plant 
chromosome. 

82. The vector of claim 81, wherein the amplifiable region comprises 
heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region comprises

rDNA. 

84. The vector of claim 81, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the vector
to an amplifiable region of a plant chromosome comprises a sufficient portion
of an intergenic spacer region of rDNA to facilitate amplification or effect the 
targeting. 

85. The vector of claim 84, wherein the sufficient portion contains at 
least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from an
intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes a
product that confers resistance to zeomycin.

88. The vector of claim 81, wherein the recognition site comprises an
att site.

89. The vector of claim 81 that is pAgIIa or pAgIIb.
90. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth of 
animal cells in the presence of an agent normally toxic to the animal cells; and 
wherein the agent is not toxic to plant cells;

a recognition site for recombination; and

nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises an
att site.

92. The vector of claim 90, further comprising a sequence of

nucleotides that facilitates amplification of a region of a plant chromosome or 
targets the vector to an amplifiable region of a plant chromosome. 

93. The vector of claim 90, wherein the promoter is nopaline synthase 
(NOS) or CaMV35S. 

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region comprises 
heterochromatic nucleic acid. 

96. The vector of claim 92, wherein the amplifiable region comprises 

rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides that

facilitates amplification of a region of a plant chromosome or targets the vector 
to an amplifiable region of a plant chromosome comprises a sufficient portion 
of an intergenic spacer region of rDNA to effect the amplification or the 
targeting. 

98. The vector of claim 90, wherein the protein is a selectable marker

that permits growth of plant cells in the presence of an agent normally toxic to 
the plant cells. 

99. The vector of claim 98, wherein the selectable marker confers 
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent
protein. 

101. The vector of claim 90, wherein the fluorescent protein is selected 
from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth of 
plant cells in the presence of an agent normally toxic to the plant cells; and 
wherein the agent is not toxic to animal cells; 
a recognition site for recombination; and

nucleic acid encoding a protein operably linked to a plant promoter. 
103. A vector, comprising: 

a recognition site for recombination; and 

a sequence of nucleotides that facilitates amplification of a region 
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome, wherein the plant is selected from the group consisting of
Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea mays,
Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises

an att site. 

105. A cell, comprising a vector of any of claims 81-104. 

106. The cell of claim 105 that is a plant cell. 

107. A method, comprising:

introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site that
recombines with the recognition site in the vector in the presence of the
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes. 

108. The method of claim 107, wherein the recombination sites are att 
sites. 

109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises 
a promoter that upon recombination is operably linked to the selectable marker 
that in the vector is not operably associated with a promoter. 

111. The method of any of claims 107-110, further comprising 
transferring the resulting platform ACes into a plant cell to produce a plant cell 
that comprises the platform ACes. 

112. The method of claim 111, wherein the resulting platform ACes is 
isolated prior to transfer. 

113. The method of claim 111, wherein the isolated ACes is introduced 
into a plant cell by a method selected from the group consisting of protoplast 
transfection, lipid-mediated delivery, liposomes, electroporation, sonoporation, 
microinjection, particle bombardment, silicon carbide whisker-mediated 
transformation, polyethylene glycol (PEG)-mediated DNA uptake, lipofection and 
lipid-mediated carrier systems. 

114. The method of claim 111, wherein the resulting platform ACes is 
transferred by fusion of the cells. 

115. The method of claim 111, wherein the cells are plant protoplasts. 

116. The method of claim 107, wherein the cell is an animal 
cell. 

117. The method of claim 116, wherein the animal cell is a mammalian 
cell. 

118. The method of claim 111, further comprising culturing the plant 
cell that comprises the platform ACes under conditions whereby the protein 
encoded by the nucleic acid that is operably linked to a plant promoter is 
expressed. 

119. A method, comprising: 

introducing a vector of claim 81 into a plant cell; 
culturing the plant cells; and 

selecting a plant cell comprising an artificial chromosome that comprises 
one or more repeat regions. 

120. The method of claim 119, wherein a sufficient portion of the vector 
integrates into a chromosome in the plant cell to result in amplification of 
chromosomal DNA. 

121. The method of claim 119 or claim 120, wherein: 

one or more nucleic acid units is (are) repeated in a repeat region; 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

122. The method of claim 119, further comprising isolating the artificial 
chromosome. 

123. A method, comprising: 

introducing a vector into a cell, wherein: 

i) the vector comprises: 

a) nucleic acid encoding a selectable marker that is 

not operably associated with any promoter, wherein the selectable 
marker permits growth of animal cells in the presence of an agent 
normally toxic to the animal cells; and wherein the agent is not 
toxic to plant cells; 

b) a recognition site for recombination; and 

c) nucleic acid encoding a protein operably linked to 
an animal promoter; 

ii) the cell comprises: 

a platform plant artificial chromosome (PAC) that comprises 
a recombination site and an animal promoter that upon 
recombination is operably linked to the selectable marker that, in 
the vector, is not operably associated with a promoter; 

iii) introduction is effected under conditions whereby the 
vector recombines with the PAC to produce a plant platform PAC that contains 
the selectable marker operably linked to the promoter; and 

culturing the resulting cell under conditions, whereby the protein encoded 
by nucleic acid operably linked to an animal promoter is expressed. 

124. The method of claim 119, wherein the artificial chromosome is an 
ACes. 

125. The method of claim 123, wherein the plant platform PAC is an 
ACes. 

126. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker. 

127. The vector of claim 81, further comprising one or more selectable 
markers that when expressed in the plant cell permit the selection of the cell. 

128. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a region 
of a plant chromosome or targets the vector to an amplifiable region of a plant 
20 chromosome; and 

one or more selectable markers that when expressed in a plant cell 
permit the selection of the cell; wherein 

the plant transformation vector is for Agrobacterium-mediated 
transformation of plants. 

129. A method of producing a plant artificial chromosome, comprising: 

introducing the vector of any of claims 81, 127 and 128 into a cell 
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region; 




repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

130. A method of producing a plant artificial chromosome, comprising: 

introducing the vector of any of claims 81, 127 and 128 into a cell 
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region; 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

131. The method of claim 123, wherein the cell into which the vector 
is introduced is an animal cell. 

132. The method of claim 131, wherein the cell is a mammalian cell. 

Fig. 5 Construction of pAglla and pAgllb 

SEQUENCE LISTING 

<110> CHROMOS MOLECULAR SYSTEMS, INC. 
Perez, Carl 
Fabijanski, Steven 
Perkins, Edward 

<120> Plant Artificial Chromosomes, Uses thereof, and Methods of Preparing 
Plant Artificial Chromosomes 

<130> 24601-419PC 

<140> Not Yet Assigned 
<141> Herewith 

<150> US 60/294,687 
<151> 2001-05-30 

<150> US 60/296,329 
<151> 2001-06-04 

<160> 51 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 11182 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAgl plasmid 
<400> 1 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 

atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 

agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 

gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 

agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300 

ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360 

ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 

acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480 

ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540 

acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 

agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 

tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 

tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 

ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840 

gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 

gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 

cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 

ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 

gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140 

tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200 

aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260 

aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320 

ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 

ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 

cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 

atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 

accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 

gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 

gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 

ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 

cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 

aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 

gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980 
agttgccggc 
ttaccgagct 
atgagtagat 
accgacgccg 
fcgggttgtct 
cggt cgcaaa 
gaagttgaag 
tgaatcgtgg 
cggtgcgccg 
gatgctctat 
tctgtcgaag 
cgtagaggtt 
gatggcggtt 
gcccggccgc 
tggcggaaag 
tgccatgcag 
agccttgatt 
gat cgagc t a 
gacggttcac 
ggcacgccgc 
cagtggcagc 
aaatgacctg 
catgcgctac 
gatgctaggg 
tagcacgtac 
cccaaagccg 
aggcgatttt 
ctgtgcataa 
gtcgctgcgc 
aaaaatggct 
actcgaccgc 
aaaacctctg 
ggagcagaca 
tgacccagtc 
gattgtactg 
ataccgcatc 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
atattttatt 
ctgttcttcc 
gtccgccctg 
gatgttgctg 
ctttaaaaaa 
gcaatccaca 
taagctattc 
cgcatacagc 
gacgccatcg 
gacctttgga 
atcataggtg 
tcccaccagc 
tttttcgatc 
tcctcttttc 
aattcactgt 
ttttcaaagt 
caggcagcaa 



ggaggatcac 
gctatctgaa 
gaattttagc 
tggaatgccc 
gccggccctg 
ccatccggcc 
gccgcgcagg 
caagcggccg 
tcgattagga 
gacgtgggca 
cgtgaccgac 
tccgcagggc 
tcccatctaa 
gtgttccgtc 
cagaaagacg 
cgtacgaaga 
agccgctaca 
gctgattgga 
cccgattact 
gccgcaggca 
gccggagagt 
ccggagtacg 
cgcaacctga 
caaattgccc 
attgggaacc 
tacattggga 
tccgcctaaa 
ctgtctggcc 
tccctacgcc 
ggcctacggc 
cggcgcccac 
acacatgcag 
agcccgtcag 
acgtagcgat 
agagtgcacc 
aggcgctctt 
gcggtatcag 
ggaaagaac a 
ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
ttctcccaat 
ccgatatcct 
ccgcttctcc 
tctcccaggt 
tcatacagct 
tcggccagat 
gtatagggac 
tcgataatct 
gcctcactca 
acaggcagct 
gtccctttat 
ttatatacct 
agttttttca 
tacagtattt 
tccttgcatt 
tggcgtataa 
cgctctgtca 



accaagctga 
tacatcgcgc 
ggctaaagga 
catgtgtgga 
caatggcact 
cggtacaaat 
ccgcccagcg 
ctgatcgaat 
agccgcccaa 
cccgcgatag 
gagctggcga 
cggccggcat 
ccgaatccat 
cacacgttgc 
acctggtaga 
aggccaagaa 
agatcgtaaa 
tgtaccgcga 
ttttgatcga 
aggcagaagc 
tcaagaagtt 
atttgaagga 
tcgagggcga 
tagcagggga 
caaagccgta 
accggtcaca 
actctttaaa 
agcgcacagc 
ccgccgcttc 
caggcaatct 
atcaaggcac 
ctcccggaga 
ggcgcgtcag 
agcggagtgt 
atatgcggfcg 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg 
tcttttctac 
tgcattctag 
caggc t tgat 
ccctgatcga 
caagatcaat 
cgccgtggga 
cgcgcgga t c 
cgttattcag 
aatccgatat 
tttcagggct 
tgagcagatt 
ttccttccag 
accggctgtc 
tagcaggaga 
attccggtga 
aaagataccc 
ctaaaacctt 
catagtatcg 
tcgttacaat 



agatgtacgc 
agctaccaga 
ggcggcatgg 
ggaacgggcg 
ggaaccccca 
cggcgcggcg 
gcaacgcatc 
ccgcaaagaa 

gggcgacgag 
tcgcagcatc 
ggtgatccgc 
ggccagtgtg 
gaaccgatac 
ggacgtactc 
aacctgcatt 
cggccgcctg 
gagcgaaacc 
gatcacagaa 
tcccggcatc 
cagatggttg 
ctgtttcacc 
ggaggcgggg 
agcatccgcc 
aaaaggtcga 
cattgggaac 
catgtaagtg 
acttattaaa 
cgaagagc tg 
gcgtcggcct 
accagggcgc 
cctgcctcgc 
cggtcacagc 
cgggtgttgg 
atactggctt 
tgaaataccg 
gctcactgac 

ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
caagcagcag 
ggggtctgac 
gtactaaaac 
ccccagtaag 
ccggacgcag 
aaagccactt 
aaagacaagt 
tttaaatgga 
taagtaatcc 
gtcgatggag 
ttgttcatct 
gctccagcca 
ccatagcatc 
cgtcattttt 
cattccttcc 
tattctcatt 
caagaagcta 
aaataccaga 
acggagc cga 
caacatgcta 



ggtacgccaa 
gtaaatgagc 
aaaatcaaga 
gttggccagg 
agcccgagga 
ctgggtgatg 
gaggcagaag 
tcccggcaac 
caaccagatt 
atggacgtgg 
tacgagcttc 
tgggattacg 
cgggaaggga 
aagttctgcc 
cggttaaaca 
gtgacggtat 
gggcggccgg 
ggcaagaacc 

ggccgttttc 

ttcaagacga 
gtgcgcaagc 
caggc tggcc 
ggfcfccctaat 
aaaggtctct 
cggaacccgt 
actgatataa 
actcttaaaa 
caaaaagcgc 
atcgcggccg 
ggacaagccg 
gcgtttcggt 
ttgtctgtaa 
cgggtgtcgg 
aactatgcgg 
cacagatgcg 
tcgctgcgct 
cggttatcca 
aaggccagga 
gacgagcatc 
agataccagg 
cttaccggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgcgca 
gctcagtgga 
aattcatcca 
tcaaaaaata 
aaggcaa t g t 
actttgccat 
tcctcttcgg 
gtgtcttctt 
aattcggcta 
tgaaagagcc 
tcatactctt 
tcatgccgtt 
atgtcctttt 
aaatataggt 
gtatctttta 
ttagccattt 
attataacaa 
aaacagcttt 
ttttgaaacc 
ccctccgcga 



ggcaagacca 
aaatgaataa 
acaaccaggc 
cgtaagcggc 
atcggcgtga 
acctggtgga 
cacgrccccgg 
cgccggcagc 
ttttcgttcc 
ccgttttccg 
cagacgggca 
acctggtact 
agggagacaa 
ggcgagccga 
ccacgcacgt 
cccjagggtga 
agtacatcga 
cggacgtgct 
tctaccgcct 
tctacgaacg 
tgatcgggtc 
cgatcctagt 
gt acggagc a 
ttccfcgtgga 
acattgggaa 
aagagaaaaa 
cccgcctggc 
ctacccttcg 
ctggccgctc 
cgccgtcgcc 
gatgacggtg 
gcggatgccg 
ggcgcagcca 
catcagagca 
taaggagaaa 
cggt cgt teg 
cagaatcagg 
acegtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgetacaga 
gtatctgege 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
gtaaaatata 
gctcgacata 
cataccactt 
ctttcacaaa 
gettttcegt 
cccagttttc 
agcggctgtc 
tgatgeaetc 
ccgagcaaag 
caaagtgcag 
cccgttccac 
tttcattttc 
cgcagcggta 
attatttcct 
gacgaactcc 
ttcaaagttg 
gcggtgatca 
gatcatccgt 



2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
gtttcaaacc 
tctgccgcct 
cgagfcggtga 
tatattgtgg 
taatgtactg 
gttttaggaa 
ggtttcttat 
ggaactactc 
ggacggggcg 
ccgtgcttga 
atgcgcacgc 
gcctccaggg 
cggggggaga 
gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggtgg 
gagatagatt 
ttccttatat 

agtggagata 

cacgatgctc 
aacgatagcc 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgtggfcfc 
gggaccactg 
tttgtaggtg 
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgcagcfc 
aatgtgagtt 
atgttgtgtg 
tacgaattcg 
gagtttggac 
gatgctattg 
gaactccagc 
tccgaagccc 
gtcctgctcc 
ccgcccccac 
cgtggacacg 
ggccagggtg 
gtcccggacc 
ggtccagaac 
caacttggcc 
gcaggaattc 
accaaagggc 
attgcccagc 
aatgccatca 
ccaaagatgg 
cttcaaagca 
agaatatcaa 
taatatcggg 
cagtagaaaa 
ttcaagatgc 
tggaaaaaga 
ctgacgtaag 
aagttcattt 
tctctcgagc 
cgacgt ctgt 
tctcggaggg 
tgcgggtaaa 
catcggccgc 
cctattgcat 
tgcccgctgt 
gccagacgag 



cggcagctta 
tacaacggct 
ttfctgtgccg 
tgtaaacaaa 
aattaacgcc 
tfcagaaafcfcfc 
atgctcaaca 
acacattatt 
gtaccggcag 
agccggccgc 
tcgggtcgtt 
acttcagcag 
cgtacacggt 
aggcgatgcc 
gacggacgag 
tgcttgtctc 
cacggcggat 
tgtagagaga 
agaggaaggt 
tcacatcaat 
ctcgtgggfcg 
tttcctttat 
atgaagtgac 
tgaaaagtct 
acgagagtgt 
ggaacgtctt 
tcggcagagg 
ccaccttcct 
aggaggt 1 1 c 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaat tgtgag 
agccttgact 
aaaccacaac 
ctttatttgt 
atgagatccc 
aacctttcat 
fccggccacga 
ggctgctcgc 
acctccgacc 
ttgtccggca 
acaccggcga 
fccgaccgctc 
atggatccag 
gatcgacact 
tattgagact 
tatctgtcac 
ttgcgataaa 
acccccaccc 

agtggattga 

agatacagtc 
aaacctcctc 
ggaaggtggc 
ctctgccgac 
agacgttcca 
ggatgacgca 
catttggaga 
tttcgcagat 
cgagaagttt 
cgaagaatct 
tagctgcgcc 
gctcccgatt 
ctcccgccgt 
tctacaaccg 
cgggttcggc 



gttgccgttc 
ctcccgctga 
agctgccggt 
ttgacgctfca 
gaattaattc 
tattgataga 
catgagcgaa 
atggagaaac 
gctgaagtcc 
ccgcagcatg 
gggcagcccg 
gtgggtgtag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgtagtgg 
gtcggccggg 
gactggtgat 
cttgcgaagg 
ccacttgctt 

ggggtccatc 

cgcaatgatg 
agatagctgg 
caatagccct 
cgtgct ccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatattac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
cggataacaa 
agagggtcga 
tagaatgcag 
aaccattata 
cgcgctggag 
agaaggcggc 
agtgcacgca 
cgatctcggt 
actcggcgta 
ccacctggtc 
agtcgtcctc 
cggcgacgtc 
atttcgctca 
ctcgtctact 
tttcaacaaa 
ttcatcaaaa 
ggaaaggcta 
acgaggagca 
tgtgataaca 
t cagaagacc 
ggattccatt 
acctacaaat 
agtggtccca 
accacgtctt 
caatcccact 
ggacacgctg 
ccgggggggc 
ctgatcgaaa 
cgtgctttca 
gatggtttct 
ccggaagtgc 
gcacagggtg 
gtcgcggagg 
ccattcggac 



ttccgaatag 
cgccgtcccg 
cggggagctg 
gacaacttaa 

ggggg at ctg 

agtattttac 
accctatagg 
tcgagtcaaa 
agctgccaga 
ccgcgggggg 
atgacagcga 
agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
ttgacgatgg 
cgtcgttctg 
ttcagcgtgt 
atagtgggat 
tgaagacgtg 
tttgggacca 
gcatttgtag 
gcaatggaat 
ttggtcttct 
catgttatca 
gatgctcctc 
gatagccttt 
ccttttgatg 
cctttgttga 
ggagtagacg 
ccgcctctcc 
tggaaagcgg 
caggctttac 
tttcacacag 
cggtatacag 
tgaaaaaaat 
agctgcaata 
gatcatccag 
ggtggaatcg 
gttgccggcc 
catggccggc 
cagctcgtcc 
ctggaccgcg 
cacgaagtcc 
gcgcgcggtg 
agttagtata 
ccaagaatat 
gggtaatatc 
ggacagtaga 
tcgttcaaga 
tcgtggaaaa 
tggtggagca 
aaagggctat 
gcccagctat 
gccatcattg 
aagatggacc 
caaagcaagt 
atccttcgca 
aaatcaccag 
aatgagatat 
agttcgacag 
gcttcgatgt 
acaaagatcg 
ttgacattgg 
tcacgttgca 
ctatggatgc 
cgcaaggaat 



catcggtaac 
gactgatggg 
ttggctggct 
taacacattg 
gattttagta 
aaatacaaat 
aaccctaatt 
tctcggtgac 
aacccacgtc 
catatccgag 
ccacgctctt 
ccagtcccgt 
aggcgttgcg 
cggcgacgag 
gttcctgcgg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt 
gagactgfcafc 
catcaatcca 
gtgggtgggg 
cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatgct 
gaaacagcta 
acatgataag 
gctttatttg 
aacaagttgg 
ccggcgtccc 
aaatctcgta 
gggtcgcgca 
ccggaggcgt 
aggccgcgca 
ctgatgaaca 
cgggagaacc 
agcac cggaa 
aaaaagcagg 
caaagataca 
gggaaacctc 
aaaggaaggt 
tgcctctgcc 
agaagacgtt 
cgacactctc 
tgagactttt 
ctgtcacttc 
cgat aaagga 
cccacccacg 
ggattgatgt 
agaccttcct 
tctctctcta 
gaaaaagcct 
cgtctccgac 
aggagggcgt 
ttatgtttat 
ggagtttagc 
agacctgcct 
gatcgctgcg 
cggtcaatac 



at gage aaag 
ctgcctgtat 
ggtggcagga 
cggacgtttt 
ctggattttg 
acatactaag 
cccttatctg 
gggcaggacc 
atgccagttc 
cgcctcgtgc 
gaagccctgt 
ccgctggtgg 
tgccttccag 
ccagggatag 
cteggtaegg 
cggcatgtcc 
agactcgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcatcttg 
ccttttctac 
ttcccgatat 
ctttgatatt 
ettgetttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagecctttg 
gctccaccat 
gecgattcat 
caaegcaatt 
tccggctcgt 
tgaccatgat 
atacattgat 
tgaaatttgt 
ggtgggcgaa 
ggaaaacgat 
gcacgtgtca 
gggegaaetc 
cccggaagtt 
cccacaccca 
gggtcacgtc 
cgagccggtc 
cggcactggt 
cttcaatcct 
gtctcagaag 
ctcggattcc 
ggcacctaca 
gacagtggtc 
ccaaccacgt 
gbctactcca 
caacaaaggg 
atcaaaagga 
aaggctatcg 
aggagcatcg 
gatatctcca 
ctatataagg 
caaatctatc 
gaactcaccg 
ctgatgeage 
ggatatgtcc 
eggcactttg 
gagagectga 
gaaaccgaac 
gecgatctta 
actacatggc 



6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
9060 
9120 
9180 
9240 
9300 
9360 
9420 
9480 
9540 
9600 
9660 
9720 
9780 
9840 
9900 
9960 
10020 
gtgatttcat 
acaccgtcag 
gccccgaagt 
atggccgcat 
aggtcgccaa 
acttcgagcg 
gcattggtct 
gggcgcaggg 
aaatcgcccg 
gtggaaaccg 
atctgtcgat 
ggaattaggg 
gtatttgtat 
agtactaaaa 
ggccgtcgtt 
tgcagcacat 
ttcccaacag 
tgtcgtttcc 
cctaagagaa 
tccgttcgtc 



atgcgcgatt 
tgcgtccgtc 
ccggcacctc 
aacagcggtc 
catcttcttc 
gaggcatccg 
tgaccaactc 
tcgatgcgac 
cagaagcgcg 
acgccccagc 
cgacaagctc 
ttcctatagg 
ttgtaaaata 
tccagatccc 
ttacaacgtc 
ccccctttcg 
ttgcgcagcc 
cgccttcagt 
aagagcgttt 
catttgtatg 



gctgatcccc 
gcgcaggctc 
gtgcacgcgg 
attgactgga 
tggaggccgt 
gagcttgcag 
tatcagagct 
gcaatcgtcc 
gccgtctgga 
actcgtccga 
gagtttctcc 
gtttcgctca 
cttctatcaa 
ccgaattaat 
gtgactggga 
ccagctggcg 
tgaatggcga 
ttaaactatc 
at t agaataa 

tg 



atgtgtatca 
tcgatgagct 
atttcggctc 
gcgaggcgat 
ggttggcttg 
gatcgccacg 
tggttgacgg 
gatccggagc 
ccgatggctg 
gggcaaagaa 
ataataatgt 
tgtgttgagc 
taaaatttct 
tcggcgttaa 
aaaccctggc 
taatagcgaa 
atgctagagc 
agtgtttgac 
cggafcattta 



ctggcaaact 
gatgctttgg 
caacaatgtc 
gttcggggat 
tatggagcag 
actccgggcg 
caatttcgat 
cgggactgtc 
tgtagaagta 
at agagt aga 
gtgagtagtt 
atataagaaa 
aattcctaaa 
ttcagatcaa 
gttacccaac 
gaggcccgca 
agcttgagct 
aggatatatt 
aaagggcgtg 



gtgatggacg 
gccgaggact 
ctgacggaca 
tcccaatacg 
cagacgcgct 
tatatgctcc 
gatgcagctt 

gggcgtacac 

ctcgccgata 
tgccgaccgg 
cccagataag 
cccttagtat 
accaaaatcc 
gcttggcact 
ttaatcgcct 
ccgatcgccc 
tggatcagat 

sgcgggtaaa 

aaaaggttta 



10080 
10140 
10200 
10260 
10320 
10380 
10440 
10500 
10560 
10620 
10680 
10740 
10800 
10860 
10920 
10980 
11040 
11100 
11160 
11182 



<210> 2 

<211> 8428 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia3300 plasmid 



<400> 2 

catgccaacc 

atagtgcagt 

agtcctaagt 

gttttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttccct 

tgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgccttccg 

gccaagagga 

cgaagagatc 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc 

aagagaaaat 

gcaaggctgc 

agttgccggc 

ttaccgagct 

atgagtagat 

accgacgccg 



acagggttcc 
cggcttctga 
tacgcgacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
agtgaccagg 
catccaggag 
gccggccggc 
aatcatcgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgca 
acaagcatga 
gaggcggaga 
cggctgcatg 
ttggccgctg 
agcttgcgtc 
gggaacgcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tcgtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
ccgccgccgg 
aggcgctggc 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 
gaattttagc 
tggaatgccc 



cctcgggatc 
cgttcagtgc 
gctgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
accctcaccc 
gtgaaagagg 
cgcagcgagg 
ttgaccgagg 
aaccgcacca 
tgatcgcggc 
aaatcctggc 
aagaaaccga 
atgcggtcgc 
gaaggttatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
cgacggagcg 
gctgattccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 
ggctaaagga 
catgtgtgga 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
ccaggatgct 
tggcccgcag 
gcctgcgtag 
tgaccgtgtt 
gcgggcgcga 
cggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
cgggtacgtg 
cggtttgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtcccgta 
cttgaatcag 
aaatcaaaac 
taagtgccgg 
acacgccagc 
agatgtacgc 
agctaccaga 
ggcggcatgg 
ggaacgggcg 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
ca ccga cga c 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggcggccgcc 
gacgaaccgt 
ttcgagccgc 
gatgccaagc 
ctaaaaaggt 
atgcgatgag 
accagaaagg 
t cgccggggc 
cggccgtgcg 
gcgacgtgaa 
cggacttggc 
gcccttacga 
tcacggatgg 
tcggcggtga 
tcacgcagcg 
aacccgaggg 
tcatttgagt 
ccgtccgagc 
catgaagcgg 
ggtacgccaa 
gtaaatgagc 
aaaatcaaga 
gttggccagg 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gccatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gccgagttcg 
gcccgaggcg 
cgcgagctga 
catcgctcga 
aggcggcgcg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
cgggt c aggc 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggcc 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
taatgaggta 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaataa 
acaaccaggc 
cgtaagcggc 
tgggttgtct 
cggtcgcaaa 
gaagttgaag 
tgaatcgtgg 
cggtgcgccg 
gatgctctat 
tctgtcgaag 
cgtagaggtt 
gatggcggtt 
gcccggccgc 

tggcggaaag 
tgccatgcag 
agccttgatt 
gatcgagcta 
gacggttcac 
ggcacgccgc 
cagtggcagc 
aaatgacctg 
catgcgctac 
gatgctaggg 
tagcacgtac 
cccaaagccg 
aggcgatttt 
ctgtgcataa 
gtcgcfcgcgc 
aaaaatggct 
actcgaccgc 
aaaacctctg 
ggagcagaca 
tgacccagtc 
gattgfcacfcg 
ataccgcafcc 
gctgcggcga 
ggataacgca 
ggccgcgfcfcg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
atattttatt 
ctgttcttcc 
gtccgccctg 
gatgttgctg 
ctttaaaaaa 
gcaatccaca 
taagctattc 
cgcatacagc 
gacgccatcg 
gacctttgga 
atcataggtg 
tcccaccagc 
tttttcgatc 
tcctcttttc 
aattcactgt 
tttfccaaagt 
caggcagcaa 
gtttcaaacc 
tctgccgcct 
cgagtggtga 
tatattgtgg 



gccggccctg 
ccatccggcc 
gccgcgcagg 
caagcggccg 
tcgattagga 
gacgtgggca 
cgtgaccgac 
tccgcagggc 
tcccatctaa 
gtgttccgtc 
cagaaagacg 
cgtacgaaga 
agccgctaca 
gctgattgga 
cccgattact 
gccgcaggca 
gccggagagt 
ccggagt acg 
cgcaacctga 
caaatfcgccc 
attgggaacc 
tacattggga 
tccgcctaaa 
ctgtctggcc 
tccctacgcc 
ggc c tacggc 
cggcgcccac 
acacatgcag 
agcccgtcag 
acgtagcgat 
agagtgcacc 
aggcgctctt 
gcggtatcag 
ggaaagaaca 
cfcggcgfcfcfct 
cagaggtggc 
ctcgtgcgct 
t cgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta 
ccagttacct 

agcggtggtt 

gafccctfctga 
afctttggtca 
ttctcccaat 
ccgatatcct 
ccgcttctcc 
tctcccaggt 
tcatacagct 
tcggccagat 
gtatagggac 
tcgataatct 
gcctcactca 
acaggcagcfc 
gtccctttat 
ttatatacct 
agttttfcfcca 
tacagtattt 
tccttgcatt 
tggcgtataa 
cgctctgtca 
cggcagctta 
tacaacggct 
ttttgtgccg 
tgtaaacaaa 



caatggcact 
cggtacaaat 
ccgcccagcg 
ctgatcgaat 
agccgcccaa 
cccgcgatag 
gagcfcggcga 
cggccggcat 
ccgaatccat 
cacacgttgc 
acctggtaga 
aggccaagaa 
agatcgtaaa 
t gt ac cgcga 
ttttgatcga 
aggcagaagc 
tcaagaagtt 
atttgaagga 
tcgagggcga 
tagcagggga 
caaagccgta 
accggtcaca 
actctttaaa 
agcgcacagc 
ccgccgcttc 
caggcaatct 
atcaaggcac 
ctcccggaga 
ggcgcgtcag 
agcggagtgt 
atatgcggtg 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
fcccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
t cggaaaaag 
tttttgttfcg 
tcttttctac 
tgcattctag 
caggcttgat 
ccctgatcga 
caagatcaat 
cgccgtggga 
cgcgcggatc 
cgttattcag 
aatccgatat 
tttcagggct 
tgagcagatt 
ttccttccag 
accggctgtc 
t agcaggaga 
attccggtga 
aaagataccc 
ctaaaacctt 
catagtatcg 
tcgttacaat 
gttgccgttc 
ctcccgctga 
agctgccggt 
ttgacgctta 



ggaaccccca 
cggcgcggcg 
gcaacgcatc 
ccgc aaagaa 
gggcgacgag 
tcgcagcatc 
ggtgatccgc 
ggccagtgtg 
gaaccgatac 
ggacgtactc 
aacctgcafcfc 
cggccgcctg 
gagcgaaacc 
gatcaeagaa 
tcccggcatc 
cagatggttg 
ctgtttcacc 
ggaggcgggg 
agcatccgcc 
aaaaggtcga 
cattgggaac 
catgtaagtg 
acttattaaa 
cga agagc tg 
gcgtcggcct 
ac c aggg cgc 
cctgcctcgc 
cggtcacagc 

cgggtgttgg 

atactggctt 
tgaaataccg 
gctcactgac 
ggcggt aa t a 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agfctggfcagc 
caagcagcag 
ggggt c tgac 
gtactaaaac 
ccccagtaag 
ccggacgcag 
aaagccactt 
aaagacaagt 
tttaaatgga 
taagtaatcc 
gtcgatggag 
ttgttcatct 
gctccagcca 
ecatagcatc 
cgtcattttt 
cattccttcc 
tattctcatt 
caagaagcta 
aaataccaga 
acggagccga 
caacatgcta 
ttccgaatag 
cgccgtcccg 
cggggagctg 
gacaacttaa 



agcccgagga 
ctgggtgatg 
gaggcagaag 
tcccggcaac 
caaccagatt 
atggacgtgg 
tacgagcttc 
tgggattacg 
cgggaaggga 
aagttctgcc 
cggttaaaca 
gtgacggtat 
gggcggccgg 
ggcaagaacc 
ggccgttttc 
ttcaagacga 
gtgcgcaagc 
caggctggcc 
ggttcctaat 
aaaggtctct 
cggaacccgt 
actgatataa 
actcttaaaa 
caaaaagcgc 
atcgcggccg 
ggacaagccg 
gcgt-ttcggt 
ttgtctgtaa 
cggg t g t egg 
aacfcafcgcgg 
cacagatgeg 
tcgctgcgct 
eggttatcca 
aaggccagga 
gacgagcatc 
agataccagg 
ettaceggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgegea 
gctcagtgga 
aattcatcca 
tcaaaaaata 
aaggcaatgt 
actttgecat 
tcctcttcgg 
gtgtcttctt 
aatteggcta 
tgaaagagee 
tcatactctt 
teatgeegtt 
atgtcctttt 
aaatafcaggt 
gtatctttta 
ttagccattt 
attataacaa 
aaacagcttfc 
ttttgaaacc 
ccctccgcga 
categgtaac 
gactgatggg 
ttggctggct 
taacacattg 



atcggcgtga 
acctggtgga 
cacgccccgg 
cgccggcagc 
tttbcgttcc 
ccgttttccg 
cagaegggea 
acctggtact 
agggagacaa 
ggegage cga 
ccacgcacgt 
ccgagggtga 
agtacatcga 
eggaegtget 
tctaccgcct 
tctacgaacg 
tgatcgggtc 
cgatcctagt 
gtaeggagea 
fctccfcgtgga 
acattgggaa 
aagagaaaaa 
cccgcctggc 
ctacccttcg 
ctggccgctc 
cgccgtcgcc 
gatgaeggtg 
gcggatgccg 
ggcgcagcca 
catcagagca 
taaggagaaa 
cggfccgttcg 
cagaatcagg 
accgt aaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgetacaga 
gtatctgege 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
gtaaaatata 
gctcgacata 
cataccactt 
ctttcacaaa 
gettttcegt 
cccagttttc 
agcggctgtc 
tgatgeaetc 
ccgagcaaag 
caaagtgcag 
cccgttccac 
tttcattttc 
cgcagcggta 
attatttcct 
gacgaactcc 
ttcaaagttg 
gcggtgatca 
gatcatccgt 
atgagcaaag 
ctgcctgtat 
ggtggcagga 
cggacgtttt 
taatgtactg 
gfctttaggaa 
ggtttcttat 
ggaactactc 
ggacggggcg 
ccgtgcttga 
atgcgcacgc 
gcctccaggg 
cggggggaga 
gggcccgcgt 
cgctcccgca 
aagttgaccg 
gcctcggtgg 
gagatagatt 
ttccttatat 
agtggagata 
cacgatgctc 
aacgatagcc 
tgtccttttg 
taccctttgt 
cttggagtag 
agacgtggtt 
gggaccactg 
tttgtaggtg 
atggaatccg 
gtcttctgag 
gttggcaagc 
taatgcagct 
aatgtgagtt 
atgttgtgtg 
tacgaattcg 
ggcactggcc 
tcgccttgca 
tcgcccttcc 
tcagafcfcgtc 
ggtaaaccta 
ggtttatccg 



aattaacgcc 
ttagaaattt 
atgctcaaca 
acacattatt 
gtaccggcag 
agccggccgc 
tcgggfccgtt 
acttcagcag 
cgtacacggt 
aggcgatgcc 
gacggacgag 
tgchtgtctc 
cacggcggat 
t gt agagaga 
a gagga agg t 
tcacatcaat 
ctcgtgggtg 
tttcctttat 
atgaagfcgac 
tgaaaagtct 
acgagagtgt 
ggaacgtctt 
tcggcagagg 
ccaccttcct 
aggaggtttc 
actgtatctt 
tgctctagcc 
ggcacgacag 
agctcactca 
gaattgtgag 
agctcggtac 
gtcgttttac 
gcacatcccc 
caacagttgc 
gfcfctcccgcc 
agagaaaaga 
ttcgtccatt 



gaattaattc 
tattgataga 
catgagcgaa 
atggagaaac 
gctgaagtcc 
ccgcagcatg 
gggcagcccg 
gtgggtgtag 
cgactcggcc 
ggcgacctcg 
gtcgtccgtc 
gatgfcagtgg 
gtcggccggg 
gactggtgat 
cttgcgaagg 
ccacttgctt 
ggggtccatc 
cgcaatgatg 
agatagctgg 
caatagccct 
cgtgctccac 
ctttttccac 
catcttgaac 
tttctactgt 
ccgatatfcac 
tgatattctt 
aatacgcaaa 
gtttcccgac 
ttaggcaccc 
cggataacaa 
ccggggatcc 
aacgtcgtga 
ctttcgccag 
gcagcctgaa 
ttcagtttaa 
gcgtttatta 
tgtatgtg 
gggggatctg 

agtattttac 
accctatagg 
tcgagtcaaa 
agctgccaga 
ccgcgggggg 
atgacagcga 
agcgtggagc 
gtccagtcgt 
ccgtccacct 
cactcctgcg 
ttgacgatgg 
cgtcgttctg 
ttcagcgtgt 
a t ag t ggga t: 
tgaagacgtg 
tttgggacca 
gcat ttgt ag 
gcaafcggaat 
ttggfcefctct 
catgttatca 
gatgctcctc 
gatagccttt 
ccfctttgatg 
ccfcttgfctga 
ggagtagacg 
ccgccfcctec 
tggaaagcgg 
caggctttac 
tttcacacag 
tctagagtcg 
ctgggaaaac 
ctggcgtaat 
tggcgaatgc 
actatcagtg 
gaataacgga 



gattttagta 
aaatacaaat 
aaccctaatt 
tctcggtgac 
aacccacgtc 
catatccgag 
ccacgctctt 
ccagtcccgt 
aggcgttgcg 
cggcgacgag 
gtfccctgcgg 
tgcagaccgc 
ggctcatggt 
cctctccaaa 
tgtgcgtcat 
gttggaacgt 
ctgtcggcag 
gtgccacctt 
ccgaggaggt 
gagactgtat 
catcaatcca 

gtgggtgggg 

cctttatcgc 
aagtgacaga 
aaagtctcaa 
agagtgtcgt 
ccgcgcgttg 
gcagtgagcg 
actttatgct 
gaaacagcta 
acctgcaggc 
cctggcgtta 
agcgaagagg 
tagagcagct 
tttgacagga 
tatttaaaag 



ctggatfcttg 
acatactaag 
cccttatctg 
gggcagga cc 
atgccagttc 
cgcctcgtgc 
gaagccctgt 
ccgctggtgg 
tgccttccag 
ccagggatag 
ctcggtacgg 
cggcatgtcc 
agactcgaga 
tgaaatgaac 
cccttacgtc 
cttctttttc 
aggcafcettg 
ccttttctac 
ttcccgatat 
cfcttgatatt 
cctgctttga 
gtccatcttt 
aatgatggca 
tagctgggca 
tagccctttg 
gctccaccat 
gccgafctcat 
caacgcaafct 
tccggctcgt 
tgaccatgat 
atgcaagctt 
cccaacttaa 
cccgcaccga 
tgagcttgga 
tatattggcg 
ggcgtgaaaa 
<213> Artificial Sequence 
<300> 

<308> Genbank #AF234298 
<400> 3 

catggtagat 

tgaattagat 

tgcaacatac 

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 



ctgactagta 
ggtgatgtta 
ggaaaactta 
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc 
tacctgtcca 
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt 
atttatgaga 



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta 
tcgagcttaa
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 



acttttcact 
attttctgtc 
tatttgcact 
tggtgttcaa
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca 
tgattagagt 



ggagttgtcc 
agtggagagg 
actggaaaac 
tgcttttcaa 
gagggatacg 
gctgaagtca 
ttcaaggagg 
gtatacatca 
aacatcgaag 
gatggccctg 
gatcccaacg 
acacatggca 
ggtgaccagc 
gaatcctgtt 
tgtaataatt 
cccgcaatta 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg 
acggaaacat 
tggccgacaa 
acggcggcgt 
tccttttacc 
aaaagagaga 
tggatgaact 
tcgaatttcc 
gccggtcttg 
aacatgtaat 
tacatttaat 
acgcgataga
ctatgttact 
cctaagagaa 
tccgttcgtc 
ttgatccaac 
tctgaaaacg 
tcctggcgtt 
cggagacatt 
agcaccgacg 
aagctgtttt 
cttgaccacc 
agcacccgcg 
agcctggcag 
ttcgccggca 
gaggccgcca
atcgcgcacg 
ctgcttggcg 
cccaccgagg 
ctggcggccg 
aggacgaacc 
tgttcgagcc 
ctgatgccaa 
gtctaaaaag 
tgatgcgatg 
taaccagaaa 
actcgccggg 

ggcggccgtg 

ccgcgacgtg 
ggcggacttg
aagcccttac 
ggtcacggat 
catcggcggt 
tatcacgcag 
agaacccgag 
actcatttga 
ggccgtccga 
gccatgaagc 
gcggtacgcc 
gagtaaatga 
ggaaaatcaa 
cggttggcca 
caagcccgag 
cgctgggtga 
tcgaggcaga 
aatcccggca 
agcaaccaga 
tcatggacgt 
gctacgagct 
tgtgggatta 
accgggaagg
tcaagttctg
ttcggttaaa 
tggtgacggt 
ccgggcggcc 
aaggcaagaa 
tcggccgttt 
tgttcaagac 
ccgtgcgcaa 
ggcaggctgg 
ccggttccta 
gaaaaggtct 
accggaaccc 
tgactgatat 
aaactcttaa 
tgcaaaaagc 
ctatcgcggc
gcggacaagc 



aaacaaaata 
agatcgggaa
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca 
ttcttgtcgc 
acgccatgaa 
accaggactt 
ccgagaagat 
tacgccctgg 
acctactgga 
agccgtgggc 
ttgccgagtt 
aggcccgagg 
cccgcgagct 
tgcatcgctc 
ccaggcggcg 
ccgagaatga 
gtttttcatt 
gcccgcgcac 
gctggcggcc 
gtgatgtgta 
agtaaataaa 
ggcgggtcag 
gccgatgttc 
cgggaagatc 
aaggccatcg 
gctgtgtccg 
gacatatggg 
ggaaggctac 
gaggttgccg 
cgcgtgagct 
ggcgacgctg 
gttaatgagg 
gcgcacgcag 
gggtcaactt 
aaggcaagac 
gcaaatgaat 
gaacaaccag 
ggcgtaagcg 
gaatcggcgt 
tgacctggtg 
agcacgcccc 
accgccggca 
ttttttcgtt 
ggccgttttc 
tccagacggg 
cgacctggta 
gaagggagac 
ccggcgagcc 
caccacgcac 
atccgagggt 
ggagtacatc 
cccggacgtg 
tctctaccgc 
gatctacgaa 
gctgatcggg 
cccgatccta 
atgtacggag 
ctttcctgtg 
gtacattggg
aaaagagaaa 
aacccgcctg 
gcctaccctt 
cgctggccgc 
cgcgccgtcg 



tagcgcgcaa 
ttaaactatc 
attagaataa
tgcatgccaa 
ctatagtgca 
caagtcctaa 
gtgttttagt 
caagagcgcc 
gaccaaccaa 
caccggcacc 
cgacgttgtg 
cattgccgag 
cgacaccacc 
cgagcgttcc 
cgtgaagttt 
gatcgaccag 
gaccctgtac 
cggtgccttc 
acgccaagag 
accgaagaga 
gtctcaaccg 
tggccggcca 
tttgagtaaa 
caaatacgca 
gcaagacgac 
tgttagtcga 
aaccgctaac 
gccggcgcga 
cgatcaaggc 
ccaccgccga 
aagcggcctt 
aggcgctggc 
acccaggcac 
cccgcgaggt 
taaagagaaa 
cagcaaggct 
tcagttgccg 
cattaccgag 
aaatgagtag 
gcaccgacgc 
gctgggttgt 
gacggtcgca 
gagaagttga

ggtgaatcgt

gccggtgcgc 
ccgatgctct 
cgtctgtcga 
cacgtagagg 
ctgatggcgg 
aagcccggcc 
gatggcggaa 
gttgccatgc 
gaagccttga 
gagatcgagc 
ctgacggttc 
ctggcacgcc 
cgcagtggca 
tcaaatgacc 
gtcatgcgct 
cagatgctag 
gatagcacgt 
aacccaaagc 
aaaggcgatt 
gcctgtgcat 
cggtcgctgc 
tcaaaaatgg 
ccactcgacc 



actaggataa 
agtgtttgac 
cggatattta 
ccacagggtt 
gtcggcttct 
gttacgcgac 
cgcataaagt 
gccgctggcc 
cgggccgaac 
aggcgcgacc 
acagtgacca 
cgcatccagg 
acgccggccg 
ctaatcatcg 
ggcccccgcc 
gaaggccgca 
cgcgcacttg 
cgtgaggacg 
gaacaagcat
tcgaggcgga 
tgcggctgca 
gcttggccgc 
acagcttgcg 
aggggaacgc
catcgcaacc 
ttccgatccc 
cgttgtcggc 
cttcgtagtg 
agccgacttc 
cctggtggag 
tgtcgtgtcg 
cgggtacgag 
tgccgccgcc 
ccaggcgctg 
atgagcaaaa 
gcaacgttgg 
gcggaggatc 
ctgctatctg 
atgaatttta 
cgtggaatgc 
ctgccggccc 
aaccatccgg 
aggccgcgca
ggcaagcggc 
cgtcgattag 
atgacgtggg 
agcgtgaccg 
tttccgcagg 
tttcccatct 
gcgtgttccg 
agcagaaaga 
agcgtacgaa 
ttagccgcta 
tagctgattg 
accccgatta 
gcgccgcagg 
gcgccggaga 
tgccggagta 
accgcaacct 
ggcaaattgc 
acattgggaa
cgtacattgg 
tttccgccta 
aactgtctgg 
gctccctacg 
ctggcctacg 
gccggcgccc 



attatcgcgc 
aggatatatt 
aaagggcgtg 
cccctcggga 
gacgttcagt 
aggctgccgc 
agaatacttg 
tgctgggcta 
tgcacgcggc 
gcccggagct 
ggctagaccg 
aggccggcgc 
gccgcatggt 
accgcacccg 
ctaccctcac 
ccgtgaaaga 
agcgcagcga 
cattgaccga
gaaaccgcac 
gatgatcgcg 
tgaaatcctg 
tgaagaaacc 
tcatgcggtc 
atgaaggtta 
catctagccc 
cagggcagtg 
atcgaccgcc 
atcgacggag 
gtgctgattc 
ctggttaagc 
cgggcgatca 
ctgcccattc 
ggcacaaccg 
gccgctgaaa 
gcacaaacac 
ccagcctggc 
acaccaagct 
aatacatcgc 
gcggctaaag 
cccatgtgtg 
tgcaatggca 
cccggtacaa 
ggccgcccag 
cgctgatcga 
gaagccgccc 
cacccgcgat 
acgagctggc 
gccggccggc 
aaccgaatcc 
tccacacgtt 
cgacctggta 
gaaggccaag 
caagatcgta 
gatgtaccgc 
ctttttgatc 
caaggcagaa 
gttcaagaag 
cgatttgaag 
gatcgagggc 
cctagcaggg 
cccaaagccg 
gaaccggtca 
aaactcttta 
ccagcgcaca 
ccccgccgct 
gccaggcaat
acatcaaggc 



gcggtgtcat 
ggcgggtaaa
aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 
cgactagaac 
tgcccgcgtc 
cggctgcacc 
ggccaggatg 
cctggcccgc 
gggcctgcgt 
gttgaccgtg 
gagcgggcgc 
cccggcacag 
ggcggctgca 
ggaagtgacg 
ggccgacgcc 
caggacggcc 
gccgggtacg 
gccggtttgt 
gagcgccgcc 
gctgcgtata 
tcgctgtact 
gcgccctgca 
cccgcgattg 
cgacgattga 
cgccccaggc 
cggtgcagcc 
agcgcattga 
aaggcacgcg 
ttgagtcccg 
ttcttgaatc 
ttaaatcaaa 
gctaagtgcc 
agacacgcca 
gaagatgtac 
gcagctacca 
gaggcggcat 
gaggaacggg 
ctggaacccc 
atcggcgcgg 
cggcaacgca 
atccgcaaag 
aagggcgacg 
agtcgcagca 
gaggtgatcc 
atggccagtg 
atgaaccgat 
gcggacgtac 
gaaacctgca 
aacggccgcc 
aagagcgaaa 
gagatcacag 
gatcccggca 
gccagatggt 
ttctgtttca 
gaggaggcgg 
gaagcatccg 
gaaaaaggtc 
tacattggga 
cacatgtaag 
aaacttatta 
gccgaagagc
tcgcgtcggc 
ctaccagggc 
accctgcctc 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5040
gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5100
ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc 5160
ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac 5220
cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg 5280
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 5340
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 5400
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5460
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5520
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5580
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 5640
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5700
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5760
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5820
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 5880
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 5940
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6000
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6060
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa 6120
acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta 6180
agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc 6240
agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac 6300
ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa 6360
gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg 6420
gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat 6480
ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg 6540
agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat 6600
cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc 6660
catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca 6720
tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt 6780
ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt 6840
ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca 6900
ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc 6960
taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca 7020
gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc 7080
gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc 7140
taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat 7200
agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc 7260
cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc 7320
tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt 7380
aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc 7440
tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt 7500
acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata 7560
ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt 7620
gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg 7680
gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 7740
tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 7800
tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 7860
gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg 7920
cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 7980
gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 8040
tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 8100
ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 8160
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 8220
gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 8280
cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg 8340
tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 8400
ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg 8460
gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 8520
gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 8580
ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 8640
ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc cgggatctgc 8700
gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca 8760
aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc 8820
atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac 8880
gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc 8940
agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc 9000
ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagagt cgacctgcag 9780
gcatgcaagc ttggcactgg ccgtcgtttt acaacgtcgt gactgggaaa accctggcgt 9840
tacccaactt aatcgccttg cagcacatcc ccctttcgcc agctggcgta atagcgaaga 9900
ggcccgcacc gatcgccctt cccaacagtt gcgcagcctg aatggcgaat gctagagcag 9960
cttgagcttg gatcagattg tcgtttcccg ccttcagttt agcttcatgg agtcaaagat 10020
tcaaatagag gacctaacag aactcgccgt aaagactggc gaacagttca tacagagtct 10080
cttacgactc aatgacaaga agaaaatctt cgtcaacatg gtggagcacg acacacttgt 10140
ctactccaaa aatatcaaag atacagtctc agaagaccaa agggcaattg agacttttca 10200
acaaagggta atatccggaa acctcctcgg attccattgc ccagctatct gtcactttat 10260
tgtgaagata gtggaaaagg aaggtggctc ctacaaatgc catcattgcg ataaaggaaa 10320
ggccatcgtt gaagatgcct ctgccgacag tggtcccaaa gatggacccc cacccacgag 10380
gagcatcgtg gaaaaagaag acgttccaac cacgtcttca aagcaagtgg attgatgtga 10440
tatctccact gacgtaaggg atgacgcaca atcccactat ccttcgcaag acccttcctc 10500
tatataagga agttcatttc atttggagag aacacggggg actcttgac 10549

<210> 4 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35SpolyA Primer
<400> 4 

ctgaattaac gccgaattaa ttcgggggat ctg 33 

<210> 5 
<211> 29 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> CaMV35Spr Primer 
<400> 5 

ctagagcagc ttgccaacat ggtggagca 29

<210> 6 
<211> 12592 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAg2 Plasmid 
<400> 6 

gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 60 
gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga gtacatcgag atcgagctag 120
ctgattggat gtaccgcgag atcacagaag gcaagaaccc ggacgtgctg acggttcacc 180
ccgattactt tttgatcgat cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 240
ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat ctacgaacgc agtggcagcg 300
ccggagagtt caagaagttc tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 360
cggagtacga tttgaaggag gaggcggggc aggctggccc gatcctagtc atgcgctacc 420
gcaacctgat cgagggcgaa gcatccgccg gttcctaatg tacggagcag atgctagggc 480 
aaattgccct agcaggggaa aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 540 
ttgggaaccc
acattgggaa 
ccgcctaaaa 
tgtctggcca 
ccctacgccc 
gcctacggcc 
ggcgcccaca 
cacatgcagc 
gcccgtcagg 
cgtagcgata 
gagtgcacca 
ggcgctcttc 
cggtatcagc 
gaaagaacat 
tggcgttttt
agaggtggcg 
tcgtgcgctc 
cgggaagcgt
ttcgctccaa 
ccggtaacta
ccactggtaa 

ggtggcctaa 

cagttacctt 
gcggtggttt
atcctttgat 
ttttggtcat 
tctcccaatc 
cgatatcctc 
cgcttctccc 
ctcccaggtc 
catacagctc 
cggccagatc 
tatagggaca 
cgataatctt
cctcactcat 
caggcagctt 
tccctttata 
tatatacctt 
gttttttcaa 
acagtattta 
ccttgcattc 
ggcgtataac 
gctctgtcat 
ggcagcttag 
acaacggctc 
tttgtgccga 
gtaaacaaat 
attaacgccg 
tagaaatttt 
tgctcaacac 
cacattatta 
taccggcagg 
gccggccgcc 
cgggtcgttg
cttcagcagg 
gtacacggtc 
ggcgatgccg 
acggacgagg 
gcttgtctcg 
acggcggatg 
gtagagagag 
gaggaaggtc 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 



aaagccgtac 
ccggtcacac 
ctctttaaaa 
gcgcacagcc 
cgccgcttcg
aggcaatcta 
tcaaggcacc 
tcccggagac 
gcgcgtcagc 
gcggagtgta 
tatgcggtgt
cgcttcctcg 
tcactcaaag 
gtgagcaaaa 
ccataggctc 
aaacccgaca 
tcctgttccg 
ggcgctttct
gctgggctgt 
tcgtcttgag
caggattagc 
ctacggctac 
cggaaaaaga 
ttttgtttgc
cttttctacg 
gcattctagg 
aggcttgatc 
cctgatcgac 
aagatcaata 
gccgtgggaa 
gcgcggatct 
gttattcagt 
atccgatatg 
ttcagggctt 
gagcagattg 
tccttccagc 
ccggctgtcc 
agcaggagac 
ttccggtgat 
aagatacccc 
taaaacctta 
atagtatcga 
cgttacaatc 
ttgccgttct
tcccgctgac 
gctgccggtc 
tgacgcttag 
aattaattcg 
attgatagaa 
atgagcgaaa 
tggagaaact 
ctgaagtcca 
cgcagcatgc 
ggcagcccga 
tgggtgtaga
gactcggccg 
gcgacctcgc
tcgtccgtcc 
atgtagtggt 
tcggccgggc 
actggtgatt 
ttgcgaagga 
cacttgcttt 
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagccctt 



attgggaacc 
atgtaagtga 
cttattaaaa 
gaagagctgc 
cgtcggccta 
ccagggcgcg 
ctgcctcgcg 
ggtcacagct 
gggtgttggc
tactggctta 
gaaataccgc 
ctcactgact 
gcggtaatac 
ggccagcaaa 
cgcccccctg 
ggactataaa 
accctgccgc 
catagctcac 
gtgcacgaac 
tccaacccgg 
agagcgaggt 
actagaagga 
gttggtagct
aagcagcaga 
gggtctgacg 
tactaaaaca 
cccagtaagt 
cggacgcaga 
aagccactta 
aagacaagtt
ttaaatggag 
aagtaatcca 
tcgatggagt 
tgttcatctt 
ctccagccat 
catagcatca 
gtcattttta 
attccttccg 
attctcattt 
aagaagctaa 
aataccagaa 
cggagccgat 
aacatgctac
tccgaatagc 
gccgtcccgg 
ggggagctgt 
acaacttaat 
ggggatctgg 
gtattttaca 
ccctatagga 
cgagtcaaat 
gctgccagaa 
cgcggggggc 
tgacagcgac 
gcgtggagcc 
tccagtcgta 
cgtccacctc 
actcctgcgg 
tgacgatggt 
gtcgttctgg 
tcagcgtgtc 
tagtgggatt 
gaagacgtgg 
ttgggaccac 
catttgtagg
caatggaatc 
tggtcttctg



ggaacccgta 
ctgatataaa 
ctcttaaaac 
aaaaagcgcc 
tcgcggccgc 
gacaagccgc 
cgtttcggtg 
tgtctgtaag 
gggtgtcggg 
actatgcggc 
acagatgcgt 
cgctgcgctc 
ggttatccac 
aggccaggaa 
acgagcatca 
gataccaggc 
ttaccggata 
gctgtaggta 
cccccgttca 
taagacacga 
atgtaggcgg 
cagtatttgg 
cttgatccgg 
ttacgcgcag 
ctcagtggaa
attcatccag 
caaaaaatag 
aggcaatgtc 
ctttgccatc 
cctcttcggg 
tgtcttcttc 
attcggctaa 
gaaagagcct 
catactcttc 
catgccgttc 
tgtccttttc 
aatataggtt 
tatcttttac 
tagccattta 
ttataacaag 
aacagctttt 
tttgaaaccg 
cctccgcgag 
atcggtaaca 
actgatgggc 
tggctggctg
aacacattgc 
attttagtac 
aatacaaata 
accctaattc 
ctcggtgacg 
acccacgtca 
atatccgagc 
cacgctcttg 
cagtcccgtc 
ggcgttgcgt 
ggcgacgagc 
ttcctgcggc 
gcagaccgcc 
gctcatggta 
ctctccaaat 
gtgcgtcatc 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 
cgaggaggtt 
agactgtatc 



cattgggaac
agagaaaaaa 
ccgcctggcc 
tacccttcgg 
tggccgctca 
gccgtcgcca 
atgacggtga 
cggatgccgg 
gcgcagccat 
atcagagcag 
aaggagaaaa 
ggtcgttcgg 
agaatcaggg 
ccgtaaaaag 
caaaaatcga 
gtttccccct 
cctgtccgcc 
tctcagttcg 
gcccgaccgc 
cttatcgcca 
tgctacagag 
tatctgcgct 
caaacaaacc 
aaaaaaagga 
cgaaaactca 
taaaatataa 
ctcgacatac 
ataccacttg 
tttcacaaag 
cttttccgtc 
ccagttttcg 
gcggctgtct 
gatgcactcc 
cgagcaaagg 
aaagtgcagg 
ccgttccaca 
ttcattttct 
gcagcggtat 
ttatttcctt 
acgaactcca 
tcaaagttgt 
cggtgatcac 
atcatccgtg 
tgagcaaagt 
tgcctgtatc 

gtggcaggat 
ggacgttttt 
tggattttgg
catactaagg 
ccttatctgg 
ggcaggaccg 
tgccagttcc 
gcctcgtgca 
aagccctgtg 
cgctggtggc 
gccttccagg 
cagggatagc
tcggtacgga
ggcatgtccg
gactcgagag 
gaaatgaact 
ccttacgtca 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 



ccaaagccgt 
ggcgattttt 
tgtgcataac 
tcgctgcgct 
aaaatggctg 
ctcgaccgcc 
aaacctctga 
gagcagacaa 
gacccagtca 
attgtactga 
taccgcatca 
ctgcggcgag
gataacgcag
gccgcgttgc 
cgctcaagtc 
ggaagctccc 
tttctccctt 
gtgtaggtcg 
tgcgccttat
ctggcagcag 
ttcttgaagt 
ctgctgaagc
accgctggta 
tctcaagaag 
cgttaaggga 
tattttattt 
tgttcttccc 
tccgccctgc 
atgttgctgt
tttaaaaaat 
caatccacat 
aagctattcg 
gcatacagct 
acgccatcgg 
acctttggaa 
tcataggtgg 
cccaccagct 
ttttcgatca 
cctcttttct 
attcactgtt 
tttcaaagtt 
aggcagcaac 
tttcaaaccc 
ctgccgcctt 
gagtggtgat
atattgtggt 
aatgtactga 
ttttaggaat 
gtttcttata 
gaactactca 
gacggggcgg
cgtgcttgaa 
tgcgcacgct 
cctccaggga 

ggggggagac 

ggcccgcgta 
gctcccgcag 
agttgaccgt 
cctcggtggc 
agatagattt 
tccttatata 
gtggagatat 
acgatgctcc 
acgatagcct
gtccttttga 
accctttgtt 
ttggagtaga 
cgagagtgtc gtgctccacc atgttatcac atcaatccac ttgctttgaa gacgtggttg 4620
gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg ggaccactgt 4680
cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat ttgtaggtgc 4740
caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa tggaatccga 4800
ggaggtttcc cgatattacc ctttgttgaa aagtctcaat agccctttgg tcttctgaga 4860
ctgtatcttt gatattcttg gagtagacga gagtgtcgtg ctccaccatg ttggcaagct 4920
gctctagcca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 4980
gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta 5040
gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg 5100
aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgaattcga 5160
gccttgacta gagggtcgac ggtatacaga catgataaga tacattgatg agtttggaca 5220
aaccacaact agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc 5280
tttatttgta accattataa gctgcaataa acaagttggg gtgggcgaag aactccagca 5340
tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca 5400
acctttcata gaaggcggcg gtggaatcga aatctcgtag cacgtgtcag tcctgctcct 5460
cggccacgaa gtgcacgcag ttgccggccg ggtcgcgcag ggcgaactcc cgcccccacg 5520
gctgctcgcc gatctcggtc atggccggcc cggaggcgtc ccggaagttc gtggacacga 5580
cctccgacca ctcggcgtac agctcgtcca ggccgcgcac ccacacccag gccagggtgt 5640
tgtccggcac cacctggtcc tggaccgcgc tgatgaacag ggtcacgtcg tcccggacca 5700
caccggcgaa gtcgtcctcc acgaagtccc gggagaaccc gagccggtcg gtccagaact 5760
cgaccgctcc ggcgacgtcg cgcgcggtga gcaccggaac ggcactggtc aacttggcca 5820
tggatccaga tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg caggaattcg 5880
atcgacactc tcgtctactc caagaatatc aaagatacag tctcagaaga ccaaagggct 5940
attgagactt ttcaacaaag ggtaatatcg ggaaacctcc tcggattcca ttgcccagct 6000
atctgtcact tcatcaaaag gacagtagaa aaggaaggtg gcacctacaa atgccatcat 6060
tgcgataaag gaaaggctat cgttcaagat gcctctgccg acagtggtcc caaagatgga 6120
cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 6180
gtggattgat gtgataacat ggtggagcac gacactctcg tctactccaa gaatatcaaa 6240
gatacagtct cagaagacca aagggctatt gagacttttc aacaaagggt aatatcggga 6300
aacctcctcg gattccattg cccagctatc tgtcacttca tcaaaaggac agtagaaaag 6360
gaaggtggca cctacaaatg ccatcattgc gataaaggaa aggctatcgt tcaagatgcc 6420
tctgccgaca gtggtcccaa agatggaccc ccacccacga ggagcatcgt ggaaaaagaa 6480
gacgttccaa ccacgtcttc aaagcaagtg gattgatgtg atatctccac tgacgtaagg 6540
gatgacgcac aatcccacta tccttcgcaa gaccttcctc tatataagga agttcatttc 6600
atttggagag gacacgctga aatcaccagt ctctctctac aaatctatct ctctcgagct 6660
ttcgcagatc cgggggggca atgagatatg aaaaagcctg aactcaccgc gacgtctgtc 6720
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 6780
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 6840
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 6900
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 6960
tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 7020
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 7080
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 7140
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 7200
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 7260
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 7320
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 7380
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 7440
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 7500
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 7560
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 7620
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 7680
cgccccagca ctcgtccgag ggcaaagaaa tagagtagat gccgaccgga tctgtcgatc 7740
gacaagctcg agtttctcca taataatgtg tgagtagttc ccagataagg gaattagggt 7800
tcctataggg tttcgctcat gtgttgagca tataagaaac ccttagtatg tatttgtatt 7860
tgtaaaatac ttctatcaat aaaatttcta attcctaaaa ccaaaatcca gtactaaaat 7920
ccagatcccc cgaattaatt cggcgttaat tcagatcaag cttggcactg gccgtcgttt 7980
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 8040
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 8100
tgcgcagcct gaatggcgaa tgctagagca gcttgagctt ggatcagatt gtcgtttccc 8160
gccttcagtt tggggatcct ctagactgaa ggcgggaaac gacaatctga tcatgagcgg 8220
agaattaagg gagtcacgtt atgacccccg ccgatgacgc gggacaagcc gttttacgtt 8280
tggaactgac agaaccgcaa cgttgaagga gccactcagc cgcgggtttc tggagtttaa 8340
tgagctaagc acatacgtca gaaaccatta ttgcgcgttc aaaagtcgcc taaggtcact 8400
atcagctagc aaatatttct tgtcaaaaat gctccactga cgttccataa attcccctcg 8460
gtatccaatt agagtctcat attcactctc aatccaaata atctgcaccg gatctcgaga 8520
atcgaattcc cgcggccgcc atggtagatc tgactagtaa aggagaagaa cttttcactg 8580
gagttgtccc aattcttgtt gaattagatg gtgatgttaa tgggcacaaa ttttctgtca 8640
gtggagaggg tgaaggtgat gcaacatacg gaaaacttac ccttaaattt atttgcacta 8700
ctggaaaact acctgttccg tggccaacac ttgtcactac tttctcttat ggtgttcaat 8760
gcttttcaag atacccagat catatgaagc ggcacgactt cttcaagagc gccatgcctg 8820
agggatacgt gcaggagagg accatcttct tcaaggacga cgggaactac aagacacgtg 8880
ctgaagtcaa gtttgaggga gacaccctcg tcaacaggat cgagcttaag ggaatcgatt 8940
tcaaggagga cggaaacatc ctcggccaca agttggaata caactacaac tcccacaacg 9000
tatacatcat ggccgacaag caaaagaacg gcatcaaagc caacttcaag acccgccaca 9060
acatcgaaga cggcggcgtg caactcgctg atcattatca acaaaatact ccaattggcg 9120
atggccctgt ccttttacca gacaaccatt acctgtccac acaatctgcc ctttcgaaag 9180
atcccaacga aaagagagac cacatggtcc ttcttgagtt tgtaacagct gctgggatta 9240
cacatggcat ggatgaacta tacaaagcta gccaccacca ccaccaccac gtgtgaattg 9300
gtgaccagct cgaatttccc cgatcgttca aacatttggc aataaagttt cttaagattg 9360
aatcctgttg ccggtcttgc gatgattatc atataatttc tgttgaatta cgttaagcat 9420
gtaataatta acatgtaatg catgacgtta tttatgagat gggtttttat gattagagtc 9480
ccgcaattat acatttaata cgcgatagaa aacaaaatat agcgcgcaaa ctaggataaa 9540
ttatcgcgcg cggtgtcatc tatgttacta gatcgggaat taaactatca gtgtttgaca 9600
ggatatattg gcgggtaaac ctaagagaaa agagcgttta ttagaataac ggatatttaa 9660
aagggcgtga aaaggtttat ccgttcgtcc atttgtatgt gcatgccaac cacagggttc 9720
ccctcgggat caaagtactt tgatccaacc cctccgctgc tatagtgcag tcggcttctg 9780
acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag ttacgcgaca 9840
ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc gcataaagta 9900
gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg ccgctggcct 9960
gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac gggccgaact 10020
gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca ggcgcgaccg 10080
cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga cagtgaccag 10140
gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc gcatccagga 10200
ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca cgccggccgg 10260
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 10320
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 10380
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 10440
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga 10500
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc 10560
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 10620
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 10680
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 10740
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 10800
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 10860
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 10920
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 10980
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 11040
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 11100
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 11160
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 11220
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 11280
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 11340
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 11400
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 11460
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 11520
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 11580
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 11640
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 11700
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 11760
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 11820
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 11880
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 11940
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg acggtcgcaa accatccggc 12000
ccggtacaaa tcggcgcggc gctgggtgat gacctggtgg agaagttgaa ggccgcgcag 12060
gccgcccagc ggcaacgcat cgaggcagaa gcacgccccg gtgaatcgtg gcaagcggcc 12120
gctgatcgaa tccgcaaaga atcccggcaa ccgccggcag ccggtgcgcc gtcgattagg 12180
aagccgccca agggcgacga gcaaccagat tttttcgttc cgatgctcta tgacgtgggc 12240
acccgcgata gtcgcagcat catggacgtg gccgttttcc gtctgtcgaa gcgtgaccga 12300
cgagctggcg aggtgatccg ctacgagctt ccagacgggc acgtagaggt ttccgcaggg 12360
ccggccggca tggccagtgt gtgggattac gacctggtac tgatggcggt ttcccatcta 12420
accgaatcca tgaaccgata ccgggaaggg aagggagaca agcccggccg cgtgttccgt 12480
ccacacgttg cggacgtact caagttctgc cggcgagccg atggcggaaa gcagaaagac 12540
gacctggtag aaacctgcat tcggttaaac accacgcacg ttgccatgca gc 12592
<210> 7
<211> 3357 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pGEMEasyNOS Plasmid 



<400> 7 

tatcactagt 

tggatgcata 

tagctgtttc

agcataaagt 

cgctcactgc 

caacgcgcgg 

tcgctgcgct 

cggttatcca 

aaggccagga 

gacgagcatc 

agataccagg 

cttaccggat 

cgctgtaggt 

ccccccgttc 

gtaagacacg 

tatgtaggcg

acagtatttg 

tcttgatccg 

attacgcgca 

gctcagtgga 

ttcacctaga 

taaacttggt 

ctatttcgtt 

ggcttaccat 

gatttatcag 

ttatccgcct 

gttaatagtt 

tttggtatgg 

atgttgtgca 

gccgcagtgt 

tccgtaagat 

atgcggcgac 

agaactttaa 

ttaccgctgt 

tcttttactt 

aagggaataa 

tgaagcattt 

aataaacaaa 

aataccgcac 

ttgttaaaat 

atcggcaaaa 

gtttggaaca 

gtctatcagg 

aggtgccgta 

ggaaagccgg 

gcgctggcaa 

ccgctacagg 

tgcgggcctc 

gttgggtaac 

aatacgactc

gccgcgggaa 

gactctaatt 

atatttgcta 

gtatgtgctt 

ggttctgtca 

tgactccctt 



gaattcgcgg 
gcttgagtat 
ctgtgtgaaa 
gtaaagcctg 
ccgctttcca 
ggagaggcgg 
cggtcgttcg 
cagaatcagg 
accgtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgctacaga 
gtatctgcgc 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
tccttttaaa 
ctgacagtta 
catccatagt 
ctggccccag 
caataaacca 
ccatccagtc 
tgcgcaacgt 
cttcattcag 
aaaaagcggt 
tatcactcat 
gcttttctgt 
cgagttgctc 
aagtgctcat 
tgagatccag 
tcaccagcgt 
gggcgacacg 
atcagggtta 
taggggttcc 
agatgcgtaa 
tcgcgttaaa 
tcccttataa 
agagtccact 
gcgatggccc 
aagcactaaa 
cgaacgtggc 
gtgtagcggt 
gcgcgtccat 
ttcgctatta 
gccagggttt 
actatagggc 
ttcgattctc 
ggataccgag 
gctgatagtg 
agctcattaa 
gttccaaacg 
aattctccgc 



ccgcctgcag 
tctatagtgt 
ttgttatccg 
gggtgcctaa 
gtcgggaaac 
tttgcgtatt 
gctgcggcga 
ggataacgca 
ggccgcgttg 
acgctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
actggcagca 
gttcttgaag 
tctgctgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ccaatgctta 
tgcctgactc 
tgctgcaatg 
gccagccgga 
tattaattgt 
tgttgccatt 
ctccggttcc 
tagctccttc 
ggttatggca 
gactggtgag
ttgcccggcg 
cattggaaaa 
ttcgatgtaa 
ttctgggtga 
gaaatgttga 
ttgtctcatg 
gcgcacattt 
ggagaaaata 
tttttgttaa 
atcaaaagaa 
attaaagaac 
actacgtgaa 
tcggaaccct 
gagaaaggaa 
cacgctgcgc 
tcgccattca 
cgccagctgg 
tcccagtcac 
gaattgggcc 
gagatccggt 
gggaatttat 
accttaggcg 
actccagaaa 
taaaacggct 
tcatgatcag 



gtcgaccata 
cacctaaata 
ctcacaattc 
tgagtgagct 
ctgtcgtgcc 

gggcgctctt 
gcggtatcag 
ggaaagaaca 
ctggcgtttt
cagaggtggc 
ctcgtgcgct 
tcgggaagcg 
gttcgctcca 
tccggtaact 
gccactggta 
tggtggccta
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 
ataccgcgag 
agggccgagc 
tgccgggaag 
gctacaggca 
caacgatcaa 
ggtcctccga 
gcactgcata 
tactcaacca 
tcaatacggg 
cgttcttcgg 
cccactcgtg 
gcaaaaacag 
atactcatac 
agcggataca 
ccccgaaaag 
ccgcatcagg 
atcagctcat 
tagaccgaga 
gtggactcca 
ccatcaccct 
aaagggagcc 
gggaagaaag 
gtaaccacca 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
cgacgtcgca 
gcagattatt 
ggaacgtcag 
acttttgaac 
cccgcggctg 
tgtcccgcgt 
attgtcgttt



tgggagagct 
gcttggcgta
cacacaacat 
aactcacatt 
agctgcatta 
ccgcttcctc 
ctcactcaaa 
tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc

tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg
tcttttctac 
tgagattatc 
caatctaaag
cacctatctc 
agataactac 
acccacgctc 
gcagaagtgg 
ctagagtaag 
tcgtggtgtc
ggcgagttac 
tcgttgtcag 
attctcttac 
agtcattctg 
ataataccgc 
ggcgaaaact 
cacccaactg 
gaaggcaaaa 
tcttcctttt 
tatttgaatg 
tgccacctga 
aaattgtaag 
tttttaacca 
tagggttgag 
acgtcaaagg 
aatcaagttt 
cccgatttag 
cgaaaggagc 
cacccgccgc 
ctgttgggaa 
atgtgctgca 
aacgacggcc 
tgctcccggc 
tggattgaga 
tggagcattt 
gcgcaataat 
agtggctcct 
catcggcggg 
cccgccttca 



cccaacgcgt 
atcatggtca 
acgagccgga 
aattgcgttg 
atgaatcggc
gctcactgac 
ggcggtaata 
aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaaga 
agttggtagc 
caagcagcag 
ggggtctgac 
aaaaaggatc 
tatatatgag 
agcgatctgt 
gatacgggag 
accggctcca 
tcctgcaact 
tagttcgcca 
acgctcgtcg 
atgatccccc 
aagtaagttg 
tgtcatgcca 
agaatagtgt 
gccacatagc 
ctcaaggatc 
atcttcagca 
tgccgcaaaa 
tcaatattat 
tatttagaaa 
tgcggtgtga 
cgttaatatt 
ataggccgaa 
tgttgttcca 
gcgaaaaacc 
tttggggtcg 
agcttgacgg 
gggcgctagg 
gcttaatgcg 
gggcgatcgg 
aggcgattaa 
agtgaattgt 
cgccatggcg 
gtgaatatga 
ttgacaagaa 
ggtttctgac 
tcaacgttgc 
ggtcataacg 
gtctaga 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3357 
<211> 10122

<212> DNA

<213> Artificial Sequence 
<220> 

<223> pl302NOS Plasmid 



<400> 8 

catggtagat 

tgaattagat 

tgcaacatac

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 

acgcgataga

ctatgttact 

cctaagagaa

tccgttcgtc 

ttgatccaac 

tctgaaaacg 

tcctggcgtt 

cggagacatt 

agcaccgacg 

aagctgtttt 

cttgaccacc 

agcacccgcg 

agcctggcag 

ttcgccggca 

gaggccgcca 

atcgcgcacg 

ctgcttggcg 

cccaccgagg 

ctggcggccg 

aggacgaacc 

tgttcgagcc

ctgatgccaa 

gtctaaaaag 

tgatgcgatg 

taaccagaaa 

actcgccggg 

ggcggccgtg 

ccgcgacgtg 

ggcggacttg 

aagcccttac 

ggtcacggat 

catcggcggt 

tatcacgcag 

agaacccgag

actcatttga 

ggccgtccga 

gccatgaagc 

gcggtacgcc 

gagtaaatga

ggaaaatcaa

cggttggcca 

caagcccgag 

cgctgggtga 



ctgactagta 
ggtgatgtta 
ggaaaactta 
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc 
tacctgtcca 
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt 
atttatgaga
aaacaaaata 
agatcgggaa 
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca 
ttcttgtcgc 
acgccatgaa 
accaggactt 
ccgagaagat 
tacgccctgg 
acctactgga 
agccgtgggc 
ttgccgagtt 
aggcccgagg 
cccgcgagct 
tgcatcgctc 
ccaggcggcg 
ccgagaatga 
gtttttcatt 
gcccgcgcac 
gctggcggcc 
gtgatgtgta 
agtaaataaa 

ggcgggtcag 

gccgatgttc 
cgggaagatc 
aaggccatcg 
gctgtgtccg 
gacatatggg 
ggaaggctac 
gaggttgccg
cgcgtgagct 
ggcgacgctg 
gttaatgagg 
gcgcacgcag 
gggtcaactt 
aaggcaagac 
gcaaatgaat 
gaacaaccag 
ggcgtaagcg 
gaatcggcgt 
tgacctggtg 



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta 
tcgagcttaa 
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa 
ttaaactatc 
attagaataa 
tgcatgccaa 
ctatagtgca 
caagtcctaa 
gtgttttagt 
caagagcgcc 
gaccaaccaa 
caccggcacc 
cgacgttgtg 
cattgccgag 
cgacaccacc 
cgagcgttcc 
cgtgaagttt 
gatcgaccag 
gaccctgtac 
cggtgccttc 
acgccaagag 
accgaagaga 
gtctcaaccg 
tggccggcca 
tttgagtaaa 
caaatacgca 
gcaagacgac 
tgttagtcga 
aaccgctaac 
gccggcgcga 
cgatcaaggc 
ccaccgccga 
aagcggcctt 
aggcgctggc 
acccaggcac 
cccgcgaggt 
taaagagaaa 
cagcaaggct 
tcagttgccg 
cattaccgag 
aaatgagtag 
gcaccgacgc 
gctgggttgt 
gacggtcgca 
gagaagttga 



acttttcact 
attttctgtc 
tatttgcact
tggtgttcaa 
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca 
tgattagagt
actaggataa 
agtgtttgac 
cggatattta 
ccacagggtt 
gtcggcttct 
gttacgcgac 
cgcataaagt 
gccgctggcc 
cgggccgaac 
aggcgcgacc 
acagtgacca 
cgcatccagg 
acgccggccg 
ctaatcatcg 
ggcccccgcc 
gaaggccgca 
cgcgcacttg 
cgtgaggacg 
gaacaagcat 
tcgaggcgga 
tgcggctgca 
gcttggccgc 
acagcttgcg 

aggggaacgc 

catcgcaacc 
ttccgatccc 
cgttgtcggc
cttcgtagtg 
agccgacttc 
cctggtggag 
tgtcgtgtcg 
cgggtacgag
tgccgccgcc 
ccaggcgctg 
atgagcaaaa 
gcaacgttgg 
gcggaggatc 
ctgctatctg 
atgaatttta 
cgtggaatgc 
ctgccggccc 
aaccatccgg 
aggccgcgca 



ggagttgtcc 

agtggagagg 

actggaaaac 

tgcttttcaa 

gagggatacg 

gctgaagtca 

ttcaaggagg 

gtatacatca 

aacatcgaag 

gatggccctg 

gatcccaacg 

acacatggca 

ggtgaccagc 

gaatcctgtt 

tgtaataatt 

cccgcaatta 

attatcgcgc 

aggatatatt

aaagggcgtg 

cccctcggga 

gacgttcagt 

aggctgccgc 

agaatacttg 

tgctgggcta

tgcacgcggc 

gcccggagct 

ggctagaccg 

aggccggcgc 

gccgcatggt 

accgcacccg 

ctaccctcac 

ccgtgaaaga 

agcgcagcga 

cattgaccga 

gaaaccgcac 

gatgatcgcg 

tgaaatcctg 

tgaagaaacc 

tcatgcggtc 

atgaaggtta 

catctagccc 

cagggcagtg 

atcgaccgcc 

atcgacggag 

gtgctgattc 

ctggttaagc 

cgggcgatca 

ctgcccattc 

ggcacaaccg 

gccgctgaaa 

gcacaaacac 

ccagcctggc 

acaccaagct 

aatacatcgc 

gcggctaaag 

cccatgtgtg 

tgcaatggca 

cccggtacaa 

ggccgcccag 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg 
acggaaacat 
tggccgacaa 
acggcggcgt 
tccttttacc 
aaaagagaga
tggatgaact 
tcgaatttcc 
gccggtcttg 
aacatgtaat 
tacatttaat
gcggtgtcat 

ggcgggtaaa 

aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 
cgactagaac 
tgcccgcgtc 
cggctgcacc 
ggccaggatg 
cctggcccgc 
gggcctgcgt 
gttgaccgtg 
gagcgggcgc 
cccggcacag 
ggcggctgca 
ggaagtgacg 
ggccgacgcc 
caggacggcc 
gccgggtacg 
gccggtttgt 
gagcgccgcc 
gctgcgtata 
tcgctgtact 
gcgccctgca 
cccgcgattg 
cgacgattga 
cgccccaggc 
cggtgcagcc 
agcgcattga 
aaggcacgcg 
ttgagtcccg 
ttcttgaatc 
ttaaatcaaa 
gctaagtgcc 
agacacgcca 
gaagatgtac 
gcagctacca 
gaggcggcat 
gaggaacggg 
ctggaacccc 
atcggcgcgg 
cggcaacgca 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 
2040
2760

2820 

2880 

2940 

3000 

3060 
3240

3300 

3360 

3420 

3480 

3540 
tcgaggcaga
aatcccggca 
agcaaccaga 
tcatggacgt 
gctacgagct 
tgtgggatta
accgggaagg 
tcaagttctg 
ttcggttaaa 
tggtgacggt
ccgggcggcc 
aaggcaagaa 
tcggccgttt
tgttcaagac 
ccgtgcgcaa 
ggcaggctgg 
ccggttccta 
gaaaaggtct
accggaaccc 
tgactgatat 
aaactcttaa 
tgcaaaaagc 
ctatcgcggc 
gcggacaagc 
gcgcgtttcg 
gcttgtctgt 
ggcgggtgtc 
ttaactatgc 
cgcacagatg 
actcgctgcg 
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa
agaaggcaat
ttactttgcc 
gttcctcttc 
gagtgtcttc 
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa 
taccctccgc 
agcatcggta 
cggactgatg
tgttggctgg 
aataacacat 
tggattttag 
acaaatacaa 



agcacgcccc 
accgccggca 
ttttttcgtt 
ggccgttttc 
tccagacggg
cgacctggta 
gaagggagac 
ccggcgagcc 
caccacgcac 
atccgagggt 
ggagtacatc 
cccggacgtg 
tctctaccgc 
gatctacgaa 
gctgatcggg
cccgatccta 
atgtacggag 
ctttcctgtg 
gtacattggg 
aaaagagaaa 
aacccgcctg 
gcctaccctt 
cgctggccgc 
cgcgccgtcg 
gtgatgacgg 
aagcggatgc 
ggggcgcagc 
ggcatcagag 
cgtaaggaga
ctcggtcgtt
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
cggtgctaca
tggtatctgc
cggcaaacaa
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc
gttttcattt
tacgcagcgg
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 



ggtgaatcgt 
gccggtgcgc 
ccgatgctct 
cgtctgtcga 
cacgtagagg
ctgatggcgg 
aagcccggcc 
gatggcggaa 
gttgccatgc 
gaagccttga
gagatcgagc 
ctgacggttc 
ctggcacgcc 
cgcagtggca 
tcaaatgacc 
gtcatgcgct
cagatgctag
gatagcacgt 
aacccaaagc 
aaaggcgatt
gcctgtgcat 
cggtcgctgc 
tcaaaaatgg 
ccactcgacc 
tgaaaacctc 
cgggagcaga
catgacccag
cagattgtac
aaataccgca
cggctgcggc 
ggggataacg 
aaggccgcgt
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg
cgctgcgcct
ccactggcag
gagttcttga
gctctgctga
accaccgctg 
ggatctcaag 
tcacgttaag 
taatatttta 
tactgttctt
ttgtccgccc 
aagatgttgc 
gtctttaaaa 
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt
tttaatgtac 
tggttttagg 
agggtttctt 



ggcaagcggc
cgtcgattag 
atgacgtggg 
agcgtgaccg 
tttccgcagg
tttcccatct 
gcgtgttccg 
agcagaaaga 
agcgtacgaa
ttagccgcta
tagctgattg 
accccgatta 
gcgccgcagg 
gcgccggaga
tgccggagta 
accgcaacct 
ggcaaattgc 
acattgggaa
cgtacattgg 
tttccgccta 
aactgtctgg 
gctccctacg
ctggcctacg
gccggcgccc
tgacacatgc 
caagcccgtc 
tcacgtagcg
tgagagtgca
tcaggcgctc
gagcggtatc
caggaaagaa 
tgctggcgtt 
gtcagaggtg
ccctcgtgcg
cttcgggaag
tcgttcgctc
tatccggtaa
cagccactgg
agtggtggcc
agccagttac
gtagcggtgg 
aagatccttt 
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag 
tcgtataggg
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca
gttggcgtat
aacgctctgt
cccggcagct
cttacaacgg
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa



cgctgatcga
gaagccgccc 
cacccgcgat 
acgagctggc 
gccggccggc 
aaccgaatcc 
tccacacgtt 
cgacctggta 
gaaggccaag 
caagatcgta
gatgtaccgc
ctttttgatc 
caaggcagaa 
gttcaagaag 
cgatttgaag 
gatcgagggc 
cctagcaggg
cccaaagccg
gaaccggtca
aaactcttta 
ccagcgcaca 
ccccgccgct 
gccaggcaat
acatcaaggc
agctcccgga
agggcgcgtc
atagcggagt
ccatatgcgg
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt
taacaggatt
taactacggc
cttcggaaaa
tttttttgtt
gatcttttct
catgcattct
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg 
ctcgcgcgga
atcgttattc
acaatccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg
cttagcagga
caattccggt
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct
ccgaattaat 
tttattgata 
cacatgagcg 



atccgcaaag
aagggcgacg
agtcgcagca
gaggtgatcc 
atggccagtg 
atgaaccgat 
gcggacgtac
gaaacctgca
aacggccgcc 
aagagcgaaa 
gagatcacag 
gatcccggca 
gccagatggt 
ttctgtttca 
gaggaggcgg
gaagcatccg
gaaaaaggtc
tacattggga 
cacatgtaag 
aaacttatta 
gccgaagagc
tcgcgtcggc 
ctaccagggc 
accctgcctc 
gacggtcaca
agcgggtgtt 
gtatactggc 
tgtgaaatac 
tcgctcactg 
aaggcggtaa 
aaaggccagc
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga
tacactagaa
agagttggta
tgcaagcagc
acggggtctg
aggtactaaa 
atccccagta 
gaccggacgc 
ataaagccac
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agccatagca
tccgtcattt
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
atcaacatgc
tcttccgaat
gacgccgtcc
gtcggggagc
tagacaactt
tcgggggatc
gaagtatttt 
aaaccctata 
ggaaccctaa
gtcgatcgac 
gcgtcggttt 
tctgcgggcg
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 
ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt 
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagccctt 
gtgctccacc 
tggccgattc 
cgcaacgcaa 
cttccggctc 
tatgaccatg 
aacgacaatc
cgcgggacaa 
agccgcgggt 
ttcaaaagtc 
tgacgttcca 
ataatctgca 



ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg
acctcgtatt 
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat 
gagagctgca
gtcgcggtga 
gagagataga 
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 
actgtccttt 
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg
atgttggcaa 
attaatgcag 
ttaatgtgag 
gtatgttgtg 
attacgaatt 
tgatcatgag 
gccgttttac 
ttctggagtt 
gcctaaggtc 
taaattcccc 
ccggatctcg 



tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg
tagtgtattg
cggccgcagc
tttcaggcag
tctcgctaaa
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 
tgatgaagtg 
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 
cgaggaggtt
agactgtatc
gctgctctag 
ctggcacgac 
ttagctcact 
tggaattgtg 
cgagctcggt 
cggagaatta 
gtttggaact 
taatgagcta 
actatcagct 
tcggtatcca
agaatcgaat 
tcacacatta
ctatttcttt 
tacacagcca 
cccggctccg
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 
tgggggtcca 

atcgcaatga 
acagatagct 
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac 
agcggataac 
acccggggat 
agggagtcac 
gacagaaccg 
agcacatacg 
agcaaatatt 
attagagtct 
tcccgcggcc 



ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc
atttcagcgt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 
gggcaatgga
ctttggtctt 
accatgttat 
acgatgctcc 
acgatagcct 
gtccttttga 
accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt 
aatttcacac 
cctctagact 
gttatgaccc 
caacgttgaa 
tcagaaacca 
tcttgtcaaa 
catattcact 
gc 



actcgagctt 
gagtgctggg 
cggccgcgct 
ttgcgtcgca 
gatagagttg 
gctccggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg
catcggcgca 
cacgagattc 
tcagaaactt 
ccggatctgc 
gtcctctcca
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 
atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
gaaggcggga 
ccgccgatga 
ggagccactc 
ttattgcgcg 
aatgctccac 
ctcaatccaa 



7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10122 



<210> 9 
<211> 621 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> N. tabacum rDNA intergenic spacer (IGS) sequence
<300> 

<308> Genbank #Y08422 
<309> 1997-10-31 



<400> 9 

gtgctagcca

gctggcggtg

tgcagcggtg

gttattggtg

ttacatattt 

tgttttataa 

ttctccattg 

attttttcgt 

tttacaatgt

tttggtgttg 



atgtttaaca 
gtggaaaatt 
tttgatatcg 
gttggtcatc 
tttattaaat 
aatattttat 
ttttttctat 
tttataataa 
ttaaaagtca 
tacatgtcta 



agatgtcaag 
gcggtggttc 
gaatcactta 
tatatatttt 
ttatgcattg 
tattttatgt 
atttataata 
atatttatta 
tttgtgaata 
ttatgattct 



cacaatgaat 
gagcggtagt
tggtggttgt 
tataataata 
tttgtatttt
gttatattat 
attttcttat 
aaaaaaatat 
tattagctaa 
ctggccaaaa 



gttggtggtt
gatcggcgat
cacaatggag 
ttaagtattt 
taaatagttt 
tacttgatgt 
ttttttttgt 
tatttttgta 
gttgtacttc 
catgtctact 



ggtggtcgtg 
ggttggtgtt 
gtgcgtcatg 
tacctatttt 
ttatcgtact
attggaaatt 
tttattatgt 
aaatatatca 
tttttgtgca 
cctgtcactt



60 

120

180 

240 

300 

360 

420 

480 

540 

600 
gggttttttt ttttaagaca t



621 



<210> 10 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> PCR Primer NTIGS-F1

<400> 10

gtgctagcca atgtttaaca agatg 25

<210> 11
<211> 28
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer NTIGS-R1

<400> 11

atgtcttaaa aaaaaaaacc caagtgac 28

<210> 12

<211> 233

<212> DNA

<213> Mus musculus

<300>

<308> Genbank #V00846
<309> 1989-07-06

<400> 12

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233

<210> 13

<211> 31

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer MSAT-F1

<400> 13

aataccgcgg aagcttgacc tggaatatcg c 31

<210> 14
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer MSAT-R1
<400> 14

ataaccgcgg agtccttcag tgtgcat 27

<210> 15
<211> 277
<212> DNA

<213> Artificial Sequence

<220>
<223> Nopaline Synthase Promoter Fragment
<300>

<308> Genbank #U09365
<309> 1997-10-17

<400> 15

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60

tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120

aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180

attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240

gcgcgcggtg tcatctatgt tactagatcg ggaattc 277

<210> 16 

<211> 1812 

<212> DNA 

<213> Escherichia coli 

<220> 
<221> CDS 

<222> (1)...(1812)

<223> Beta-glucuronidase

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 16 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600

<210> 17 
<211> 603 
<212> PRT 

<213> Escherichia coli 
<300> 

<308> Genbank #S69414
<309> 1994-09-23 



<400> 17 

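Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln
595 600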
<210> 18

<211> 277

<212> DNA

<213> Artificial Sequence

<220>

<223> Nopaline Synthase Terminator Sequence
<300>

<308> Genbank #U09365
<309> 1995-10-17

<400> 18

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60
tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120
aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180
attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240
gcgcgcggtg tcatctatgt tactagatcg ggaattc 277



<210> 19 
<211> 3438 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBZeo Plasmid



<400> 19 

tcgaccctct 

gtcgtgactg 

tcgccagctg 

gcctgaatgg 

gttaactacg 

tttctaaata 

ataatattga 

ttttgcggca 

tgctgaagat 

gatccttgag 

gctatgtggc 

acactattct 

tggcatgaca 

caacttactt 

gggggatcat 

cgacgagcgt 
tggcgaacta 
agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcgct 
ctcatatata
aagattgtat 
aatttttgtt 
aaatcaaaag 
ctattaaaga 
ccactacgtg 
aatcggaacc 
gaaaggaagg 
cgctgcgcgt 
atctaggtga 
ttccactgag 
ctgcgcgtaa 
ccggatcaag 
ccaaatactg 
ccgcctacat 
tcgtgtctta 
tgaacggggg 
tacctacagc 



agtcaaggcc 
ggaaaaccct 
gcgtaatagc 
cgaatggcgc 
tcaggtggca 
cattcaaata 
aaaaggaaga 
ttttgccttc 
cagttgggtg 
agttttcgcc 
gcggtattat 
cagaatgact 
gtaagagaat 
ctgacaacga 
gtaactcgcc 
gacaccacga 
cttactctag 
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg 
ctttagattg 
aagcaaatat 
aaatcagctc 
aatagcccga 
acgtggactc 
aaccatcacc 
ctaaagggag
gaagaaagcg 
aaccaccaca 
agatcctttt 
cgtcagaccc 
tctgctgctt 
agctaccaac 
ttcttctagt 
acctcgctct 
ccgggttgga 
gttcgtgcac 
gtgagctatg 



ttaagtgagt 
ggcgttaccc 
gaagaggccc 
ttcgcttggt 
cttttcgggg 
tgtatccgct 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc
ttgatcgttg 
tgcctgtagc 
cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaccccg 
ttaaattgta 
attttttaac 
gatagggttg 
caacgtcaaa 
caaatcaagt 
cccccgattt 
aaaggagcgg 
cccgccgcgc 
tgataatctc 
cgtagaaaag 
gcaaacaaaa 
tctttttccg 
gtagccgtag 
gctaatcctg 
ctcaagacga 
acagcccagc 
agaaagcgcc 



cgtattacgg 
aacttaatcg 
gcaccgatcg 
aataaagccc 
aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttctccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta 
ggaaccggag 
aatggcaaca 
acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
gttgataatc 
aacgttaata 
caataggccg 
agtgttgttc 
gggcgaaaaa 
tttttggggt 
agagcttgac 
gcgctagggc 
ttaatgcgcc 
atgaccaaaa 
atcaaaggat 
aaaccaccgc 
aaggtaactg 
ttaggccacc 
ttaccagtgg 
tagttaccgg 
ttggagcgaa 
acgcttcccg 



actggccgtc 
ccttgcagca 
cccttcccaa 
gcttcggcgg 
ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg 
acagaaaagc 
atgagtgata 
accgcttttt 
ctgaatgaag 
acgttgcgca 
gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
agaaaagccc 
ttttgttaaa 
aaatcggcaa 
cagtttggaa 
ccgtctatca 

cgaggtgccg 

ggggaaagcg 
gctggcaagt 
gctacagggc 
tcccttaacg 
cttcttgaga 
taccagcggt
gcttcagcag 
acttcaagaa 
ctgctgccag 
ataaggcgca 
cgacctacac 
aagggagaaa 



gttttacaac 
catccccctt 
cagttgcgca 
gctttttttt 
tttgtttatt 
aaatgcttca 
ttattccctt 
aagtaaaaga 
acagcggtaa 
ttaaagttct 
gtcgccgcat 
atcttacgga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 
aggcggataa 
ctgataaatc 
atggtaagcc 
aacgaaatag 
accaagttta 
caaaaacagg 
attcgcgtta 
aatcccttat 
caagagtcca 
gggcgatggc 
taaagcacta 
aacgtggcga 
gtagcggtca 
gcgtaaaagg 
tgagttttcg 
tccttttttt 
ggtttgtttg 
agcgcagata 
ctctgtagca 
tggcgataag 
gcggtcgggc 
cgaactgaga 
ggcggacagg 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 
tatccggtaa
gcctggtatc 
tgatgctcgt 
ttcctggcct 
accccaggct 
acaatttcac 
ctagtggggc 
tgctttttta 
ccggtgctca 
ttctcccggg 
ttcatcagcg 
cgcggcctgg 
gcctccgggc 
cgcgacccgg 
cgagatttcg 
gacgccggct 
aacttgttta 
aataaagcat 
tatcatgtct 



gcggcagggt 
tttatagtcc 
caggggggcg
tttgctggcc 
ttacacttta 
acaggaaaca 
ccgtgcaatt 
tactaacttg 
ccgcgcgcga 
acttcgtgga 
cggtccagga 
acgagctgta 
cggccatgac 
ccggcaactg 
attccaccgc 
ggatgatcct 
ttgcagctta
ttttttcact 
gtataccg 



cggaacagga 
tgtcgggttt 
gagcctatgg 
ttttgctcac 
tgcttccggc 
gctatgacca 
gaagccggct 
agcgaaatct 
cgtcgccgga 
ggacgacttc 
ccaggtggtg 
cgccgagtgg 
cgagatcggc 
cgtgcacttc 
cgccttctat 
ccagcgcggg 
taatggttac 
gcattctagt 



gagcgcacga 
cgccacctct 
aaaaacgcca 
atgtaatgtg 
tcgtatgttg 
tgattacgcc 
ggcgccaagc 
ggatccatgg 
gcggtcgagt 
gccggtgtgg 
ccggacaaca 
tcggaggtcg 
gagcagccgt 
gtggccgagg 
gaaaggttgg 
gatctcatgc 
aaataaagca 
tgtggtttgt 



gggagcttcc 
gacttgagcg 
gcaacgcggc 
agttagctca 

tgtggaattg 

aagctacgta 
ttctctgcag 
ccaagttgac 
tctggaccga 
tccgggacga 
ccctggcctg 
tgtccacgaa 
gggggcggga 

agcaggactg 
gcttcggaat 
tggagttctt 
atagcatcac 
ccaaactcat 



agggggaaac 

tcgatttttg 
ctttttacgg 
ctcattaggc 
tgagcggata
atacgactca 
gattgaagcc 
cagtgccgtt 
ccggctcggg 
cgtgaccctg 
ggtgtgggtg 
cttccgggac 
gttcgccctg 
acacgtgcta 
cgttttccgg 
cgcccacccc 
aaatttcaca 
caatgtatct 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3438 



<210> 20 
<211> 3451 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> HindIII Fragment containing the beta-glucuronidase
coding sequence, the rDNA intergenic spacer, and
the Mast1 sequence



<400> 20 

aagcttgacc 

ttaggacgtg 

gtaggacgtg 

acgacttgaa 

gactccgcgg 

gttggtggtt 

gatcggcgat 

cacaatggag 

ttaagtattt 

taaatagttt 

tacttgatgt 

ttttttttgt 

tatttttgta 

gttgtacttc 

catgtctact 

tagactgaag 

tgacccccgc 

gttgaaggag 

aaaccattat 

gtcaaaaatg 

ttcactctca 

ttcactagtg 

cccgtgaaat 

gaattgagca 

gcagttttaa 

atcagcgcga 

atgcggtcac 

gcggctatac 

gtatcacagt 

ttaccgacga 

ggatccatcg 

tggtgacgca 

atggtgatgt 

gcaccagcgg

tctatgaact 

tcggcatccg 



tggaatatcg 
aaatatggcg 
gaatatggca 
aaatgacgaa 
gaattcgatt 

ggtggtcgtg 

ggttggtgtt 
gtgcgtcatg 
tacctatttt 
ttatcgtact 
attggaaatt 
tttattatgt 
aaatatatca 
tttttgtgca 
cctgtcactt 
gcgggaaacg 
cgatgacgcg 
ccactcagcc
tgcgcgttca
ctccactgac 
atccaaataa 
gatccccggg 
caaaaaactc 
gcgttggtgg 
cgatcagttc 
agtctttata 
tcattacggc
gecatttgaa 
ttgtgtgaac 
aaacggcaag 
cagegtaatg 
tgtcgcgcaa 
cagcgttgaa 
gaetttgeaa 
gtaegtcaca 
gtcagtggca 



cgagtaaact 
aggaaaactg 
agaaaactga 
atcactaaaa 
gtgetageca 
gctggcggtg 
tgcagcggtg 
gttattggtg 
ttacatattt 
tgttttataa 
ttctccattg 
attttttcgt 
tttacaatgt 
tttggtgttg 
gggttttttt 
acaatctgat 
ggacaagccg 
gcgggtttct 
aaagtcgect 
gttccataaa 
tctgcaccgg 
tacggtcagt 
gaeggectgt 
gaaagcgcgt 
gecgatgeag 
ccgaaaggtt 
aaagtgtggg 
gecgatgtea 
aacgaactga 
aaaaagcagt 
ctctacacca 
gactgtaacc 
ctgcgtgatg 
gtggtgaatc 
gccaaaagcc 
gtgaagggcg 



gaaaatcacg 
aaaaaggtgg 
aaatcatgga 
aacgtgaaaa 
atgtttaaca 
gtggaaaatt 
tttgatatcg 
gttggtcatc 
tttattaaat 
aatattttat 
ttttttctat 
tttataataa 
ttaaaagtca 
tacatgtcta 
ttttaagaca 
catgagegga 
ttttacgttt 
ggagtttaat 
aaggtcacta 
ttcccctcgg 
atctcgagat 
cccttatgtt 
gggcattcag 
tacaagaaag 
atattegtaa 
gggcaggeca 
tcaataatca 
cgccgtatgt 
actggcagac 
cttacttcca 
cgccgaacac 
aegegtctgt 
eggatcaaca 
cgcacctctg 
agacagagtg 
aacagttcct 



gaaaatgaga 
aaaatttaga 
aaatgagaaa 
atgagaaatg 
agatgtcaag 
gcggtggttc 
gaatcactta 
tatatatttt 
ttatgcattg 
tattttatgt 
atttataata 
atatttatta 
tttgtgaata 
ttatgattct 
taatcactag 
gaattaaggg 
ggaactgaca 
gagctaagca 
tcagctagca 
tatccaatta 
cgaattcccg 
acgtcctgta 
tetggatege 
cegggcaatt 
ttatgtgggc 
gcgtatcgtg 
ggaagtgatg
tattgeeggg 
tatcccgccg 
tgatttcttt 
ctgggtggac 
tgactggcag 
ggtggttgca 
geaacegggt 
tgatatctac 
gatcaaccac 



aatacacact 
aatgtccact 
catccacttg 
cacactgaag 
cacaatgaat 
gageggtagt 
tggtggttgt 
tataataata 
tttgtatttt 
gttatattat 
attttcttat 
aaaaaaatat 
tattagctaa 
ctggccaaaa 
tgattatatc 
agtcacgtta 
gaaccgcaac 
cataegtcag 
aatatttctt 
gagtctcata 
cggccgcgaa 
gaaaccccaa 
gaaaactgtg 
gctgtgccag 
aacgtctggt 
ctgcgtttcg 
gagcatcagg 
aaaagtgtac 
ggaatggtga 
aactacgccg 
gatatcaccg 
gtggtggcca 
actggacaag 
gaaggttatc 
ccgctgcgcg 
aaaccgttct 



60 

120 

180 

240 

300 

360 

420 

480 

540

600 

660 

720 

780 

840 

900 

960 

1020 

1080

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100

2160 
actttactgg
tgctgatggt 
cgcattaccc 
ttgatgaaac 
acaagccgaa 
tacaggcgat 
gtattgccaa 
cggaagcaac 
gcgacgctca 
acggtfcggta 
ttctggcctg 
cgttagccgg 
ggctggatat 
ggaatfctcgc 
ggatcttcac 
ctggcatgaa 
ctggcgcacc 
tcgttcaaac 
gattatcata 
gacgttattt 
gatagaaaac 
gttactagat 



ctttggccgt 
gcacgatcac 
ttacgctgaa 
tgcagctgtc 
agaactgtac 
taaagagctg 
cgaaccggat 
gcgtaaactc 
caccgatacc 
tgtccaaagc 
gcaggagaaa 
gctgcactca 
gtatcaccgc 
cgattttgcg
ccgcgaccgc 
cttcggtgaa 
afccgtcggct 
atttggcaat 
taatttctgt 
afcgagatggg 
aaaatatagc 
cgggaattcg 



catgaagatg 
gcattaatgg 
gagatgctcg 
ggctttaacc 
agcgaagagg 
atagcgcgtg 
acccgtccgc 
gatccgacgc 
atcagcgatc 
ggcgatttgg 
ctgcatcagc 
atgtacaccg 
gtctttgatc 
acctcgcaag 
aaaccgaagt 
aaaccgcagc 
acagcctcgg 
aaagtttctt 
tgaattacgt 
ttttfcatgat 
gcgcaaacta 
atatcaagct 



cggatttgcg 
actggattgg 
actgggcaga 
tctctttagg 
cagtcaacgg
acaaaaacca 
aaggtgcacg 
gtccgatcac 
fcctttgatgt 
aaacggcaga 
cgattatcat 
acatgtggag 
gcgtcagcgc
gcatattgcg 
cggcggcttt 
agggaggcaa
gaattgcgta 
aagattgaat 
taagcatgta 
tagagtcccg 
ggataaatta 
t 



cggcaaagga 
ggccaactcc 
tgaacatggc 
cattggtttc 
ggaaactcag 
cccaagcgtg 
ggaatatttc 
ctgcgtcaat 
gctgtgcctg 
gaaggtactg 
caccgaatac 
tgaagagtat 
cgtcgtcggt 
cgttggcggt 
tctgctgcaa 
acaatgaatc 
ccgagctcga 
ccfcgttgccg 
ataattaaca 
caattataca 
tcgcgcgcgg 



ttcgataacg 
taccgtacct 
atcgtggtga 
gaagcgggca 
caggcgcact 
gtgatgtgga 
gcgccactgg 
gtaatgttct 
aaccgttatt 
gaaaaagaac 
ggcgtggata 
cagtgtgcat 
gaacaggtat 
aacaagaagg 
aaacgctgga 
aacaactctc 
atttccccga 
gtcttgcgat 
tgtaatgcat 
tttaatacgc 
fcgtcatctat 



2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3451 



<210> 21 
<211> 14627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAglla Plasmid 



<400> 21 

catgccaacc 

atagtgcagt 

agtcctaagt 

gttttagtcg 

agagcgccgc 

ccaaccaacg 

ccggcaccag 

acgttgtgac 

ttgccgagcg 

acaccaccac 

agcgttccct 

tgaagtttgg 

tcgaccagga 

ccctgtaccg 

gtgcctfcccg 

gccaagagga 

cgaagagatc 

ctcaaccgtg 

gccggccagc 

tgagtaaaac 

aatacgcaag 

aagacgacca 

ttagtcgatt 

ccgctaaccg 

cggcgcgact 

atcaaggcag 

accgccgacc 

gcggcctttg 

gcgctggccg 

ccaggcactg 

cgcgaggtcc

aagagaaaat 

gcaaggctgc 

agttgccggc 

ttaccgagct 



acagggttcc
eggcttctga 
tacgegacag 
cataaagtag 
cgctggcctg 
ggccgaactg 
gcgcgaccgc 
agtgaccagg 
catccaggag 
gccggccggc 
aatcatcgac 
cccccgccct 
aggccgcacc 
cgcacttgag 
tgaggacgea 
acaagcatga 
gaggeggaga 
eggctgeatg 
tfcggccgctg 
agettgegtc 
gggaaegcat 
tcgcaaccca 
ccgatcccca 
ttgtcggcat 
tegtagtgat 
ccgacttcgt 
tggtggagct 
tcgtgtcgcg 
ggtacgagct 
ccgccgccgg 
aggegctgge 
gagcaaaagc 
aacgttggcc 
ggaggatcac 
gctatctgaa 



ccfccgggatc 
cgttcagtgc 
gctgccgccc 
aatacttgcg 
ctgggctatg 
cacgcggccg 
ccggagctgg 
ctagaccgcc 
gccggcgcgg 
cgcatggtgt 
cgcacccgga 
acccfccaccc 
gtgaaagagg 
egcagegagg 
ttgaccgagg 
aaccgcacca 
tgatcgegge 
aaafccctggc 
aagaaaccga 
atgeggtege 
gaaggttatc 
tctagcccgc 
gggcagtgcc 
cgaccgcccg 
egaeggageg 
gctgattccg 
ggttaagcag 
ggcgatcaaa 
gcccattctt 
cacaaccgtt 
cgctgaaatt 
acaaacacgc 
agcctggcag 
accaagctga 
tacatcgcgc 



aaagtacttt 
agccgtcttc 
tgcccttttc 
actagaaccg 
cccgcgtcag 
gctgcaccaa 
ecaggatget 
tggcccgcag 
gectgegtag 
tgaccgtgtt 
gegggegega 
eggcacagat 
cggctgcact 
aagtgacgcc 
ccgacgccct 
ggacggccag 
egggtaegtg 
cggtttgtct 
gcgccgccgt 
tgcgtatatg 
gctgtactta 
gccctgcaac 
cgcgattggg 
acgattgacc 
ccccaggcgg 
gtgcagccaa 
cgcattgagg 
ggcacgcgca 
gagtccegta 
cttgaatcag 
aaatcaaaac
taagtgccgg 
acacgccagc 
agatgtaege 
agctaccaga 



gatccaaccc 
tgaaaacgac 
ctggcgtttt 
gagacattac 
caccgacgac 
gctgttttcc 
tgaccaccta 
cacccgcgac 
cctggcagag 
cgccggcatt 
ggccgccaag 
cgcgcacgcc 
gcttggcgtg 
caccgaggcc 
ggcggccgcc 
gaegaacegt 
ttcgagccgc 
gatgecaage 
ctaaaaaggt 
atgegatgag 
accagaaagg 
tcgccggggc 
cggccgtgcg 
gcgacgtgaa 
cggacttggc
gcccttacga 
teaeggatgg 
tcggcggtga 
tcacgcagcg 
aacccgaggg 
tcatttgagt 
ccgtccgagc 
catgaagegg 
ggtacgccaa 
gtaaatgagc 



ctccgctgct 
atgtcgcaca 
cttgtcgcgt 
gecatgaaca 
caggacttga 
gagaagatca 
cgccctggcg 
ctactggaca 
ccgtgggccg 
gecgagtteg 
gcccgaggcg
cgcgagctga 
catcgctcga 
aggeggegeg 
gagaatgaac 
ttttcattac 
ccgcgcacgt 
tggcggcctg 
gatgtgtatt 
taaataaaca 
egggtcagge 
cgatgttctg 
ggaagatcaa 
ggccatcggc 
tgtgtccgcg 
catatgggee 
aaggctacaa 
ggttgccgag 
cgtgagctac 
cgacgctgcc 
taatgaggta 
gcacgcagca 
gtcaactttc 
ggcaagacca 
aaatgaataa 



60 

120 
atgagfcagat gaattttagc ggctaaagga
accgacgccg tggaatgccc catgtgtgga 
tgggttgtct gccggccctg caatggcact 
cggtcgcaaa ccatccggcc cggtacaaat 
gaagttgaag gccgcgcagg ccgcccagcg 
fcgaatcgfcgg caagcggccg ctgatcgaat 
cggtgcgccg tcgattagga agccgcccaa 
gatgctctat gacgtgggca cccgcgatag 
tctgtcgaag cgtgaccgac gagctggcga 
cgtagaggtt tccgcagggc cggccggcat 
gatggcggtt tcccatctaa ccgaatccafc 
gcccggccgc gtgttccgtc cacacgttgc 
tggcggaaag cagaaagacg acctggtaga 
tgccatgcag cgtacgaaga aggccaagaa 
agccttgatt agccgcfcaca agatcgtaaa 
gatcgagcta gctgattgga tgtaccgcga 
gacggttcac cccgattact ttttgatcga 
ggcacgccgc gccgcaggca aggcagaagc 
cagtggcagc gccggagagt tcaagaagtt 
aaafcgaccfcg ccggagtacg atttgaagga 
catgcgctac cgcaacctga tcgagggcga 
gatgctaggg caaattgccc tagcagggga 
tagcacgtac attgggaacc caaagccgta 
cccaaagccg tacattggga accggtcaca 
aggcgatfctt tccgcctaaa actctttaaa 
ctgtgcataa ctgtctggcc agcgcacagc 
gtcgctgcgc tccctacgcc ccgccgcttc 
aaaaatggct ggccfcacggc caggcaatct 
actcgaccgc cggcgcccac afccaaggcac 
aaaacctctg acacatgcag ctcccggaga 
ggagcagaca agcccgtcag ggcgcgtcag 
tgacccagtc acgtagcgat agcggagtgt 
gattgtactg agagtgcacc atatgcggtg 
ataccgcatc aggcgctctt ccgcttcctc 
gctgcggcga gcggtatcag cfccactcaaa 
ggataacgca ggaaagaaca tgtgagcaaa 
ggccgcgttg ctggcgtttt tccataggct 
acgctcaagt cagaggtggc gaaacccgac 
tggaagctcc ctcgtgcgct ctcctgttcc 
ctttctccct tcgggaagcg tggcgctttc 
ggtgtaggtc gttcgctcca agctgggctg 
ctgcgcctta tccggtaact atcgtcttga 
actggcagca gccactggta acaggattag 
gttcttgaag tggtggccta actacggcta 
tctgctgaag ccagttacct tcggaaaaag 
caccgctggt agcgcytggfcfc tttttgfcfctg 
atctcaagaa gatcctttga tcttttctac 
acgttaaggg attttggtca tgcattctag 
atattttatt ttctcccaat caggcttgat 
ctgttcttcc ccgatatcct ccctgatcga 
gtccgccctg ccgcttctcc caagatcaat 
gatgttgcfcg tctcccaggt cgccgtggga 
ctttaaaaaa tcatacagct cgcgcggatc 
gcaatccaca tcggccagat cgttattcag 
taagctattc gtatagggac aatccgatat 
cgcatacagc tcgataatct tttcagggct 
gacgccatcg gcctcactca tgagcagatt 
gacctttgga acaggcagct ttccttccag 
atcataggtg gtccctttat accggctgtc 
tcccaccagc ttatafcacct tagcaggaga 
tttttcgatc agttttttca attccggtga 
tcctcttttc tacagtattt aaagataccc 
aattcactgt tccttgcatt ctaaaacctt 
ttttcaaagt tggcgtataa catagtatcg 
caggcagcaa cgctcfcgtca tcgttacaat 
gtttcaaacc cggcagctta gttgccgttc 
tctgccgcct tacaacggct ctcccgctga 
ggcggcatgg aaaatcaaga acaaccaggc 2160
ggaacgggcg gttggccagg cgtaagcggc 2220
ggaaccccca agcccgagga atcggcgtga 2280
cggcgcggcg ctgggtgatg acctggtgga 2340 
cjcaacgcatc gaggcagaag cacgccccgg 2400 
ccgcaaagaa tcccggcaac cgccggcagc 2460
999cgacgag caaccagatt ttttcgttcc 2520
tcgcagcatc atggacgtgg ccgttttccg 2580 
ggtgatccgc tacgagcttc cagacgggca 2640 
ggccagfcgfcg tgggattacg acctggtact 2700 
gaaccgatac cgggaaggga agggagacaa 2760 
ggacgtactc aagttctgcc ggcgagccga 2820
aacctgcatt cggttaaaca ccacgcacgt 2880 
cggccgcctg gtgacggtat ccgagggtga 2940 
gagcgaaacc gggcggccgg agtacatcga 3000
gatcacagaa ggcaagaacc cggacgtgct 3060
tcccggcatc ggccgfcttfcc tctaccgcct 3120 
cagatggttg ttcaagacga tctacgaacg 3180 
ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
ggaggcgggg caggctggcc cgatcctagt 3300
agcatccgcc ggttcctaat gtacggagca 3360
aaaaggtcga aaaggtctct ttcctgtgga 3420 
cattgggaac cggaacccgt acattgggaa 3480
catgtaagtg actgatataa aagagaaaaa 3540 
acttattaaa actcttaaaa cccgcctggc 3600
cgaagagctg caaaaagcgc ctacccttcg 3660 
gcgtcggcct atcgcggccg ctggccgctc 3720 
accagggcgc ggacaagccg cgccgtcgcc 3780
cdtgcctcgc gcgtttcggt gatgacggtg 3840 
cggtcacagc ttgtctgtaa gcggatgccg 3900
cgggtgttgg cgggfcgfccgg ggcgcagcca 3960
atactggctt aactatgcgg catcagagca 4020 
tgaaataccg cacagatgcg taaggagaaa 4080 
gctcactgac tcgctgcgct cggtcgttcg 4140 
ggcggtaata cggttafccca cagaatcagg 4200
aggccagcaa aaggccagga accgtaaaaa 4260 
ccgcccccct gacgagcatc acaaaaatcg 4320 
aggactataa agataccagg cgtttccccc 4380
gaccctgccg cttaccggat acctgtccgc 4440
tcatagctca cgctgtaggt atctcagttc 4500 
tgtgcacgaa ccccccgttc agcccgaccg 4560
gtccaacccg gtaagacacg acttatcgcc 4620
cagagcgagg tatgfcaggcgr gtgctacaga 4680
cactagaagg acagtatttg gtatctgcgc 4740 
agttggtagc tcttgatccg gcaaacaaac 4800 
caagcagcag attacgcgca gaaaaaaagg 4860
ggggtctgac gctcagtgga acgaaaactc 4920
gtactaaaac aattcatcca gtaaaatata 4980 
ccccagtaag tcaaaaaata gctcgacata 5040 
ccggacgcag aaggcaatgt cataccactt 5100 
aaagccactt actttgccat ctttcacaaa 5160 
aaagacaagt tcctcttcgg gcttttccgt 5220 
tttaaatgga gtgtcttctt cccagttttc 5280 
taagtaatcc aattcggcta agcggctgtc 5340 
gtcgatggag tgaaagagcc tgatgcactc 5400
ttgttcatct tcatactctt ccgagcaaag 5460 
gctccagcca tcatgccgtt caaagtgcag 5520 
ccatagcatc atgtcctttt cccgttccac 5580 
cgtcattttt aaatataggt tttcattttc 5640 
cattccttcc gtatctttta cgcagcggta 5700 
tattctcatt ttagccattt attatttcct 5760 
caagaagcta attataacaa gacgaactcc 5820
aaataccaga aaacagcttt ttcaaagttg 5880
acggagccga ttttgaaacc gcggtgatca 5940 
caacatgcta ccctccgcga gatcatccgt 6000
ttccgaatag catcggtaac atgagcaaag 6060 
cgccgtcccg gactgatggg ctgcctgtat 6120 
cgagtggtga ttttgtgccg agctgccggt
tatattgtgg tgtaaacaaa ttgacgcfcta 
taatgtactg aattaacgcc gaattaattc 
gfctttaggaa ttagaaattt tattgataga 
ggtttcttat atgctcaaca catgagcgaa 
ggaactactc acacattatt atggagaaac 
ggacggggcg gtaccggcag gctgaagtcc 
ccgtgcttga agccggccgc ccgcagcatg 
atgcgcacgc tcgggtcgtt gggcagcccg 
gcctccaggg acttcagcag gtgggtgtag 
cggggggaga cgtacacggt cgactcggcc
gggcccgcgt aggcgatgcc ggcgacctcg 
cgctcccgca gacggacgag gtcgtccgtc 
aagttgaccg tgcttgtctc gatgfcagtgg 
gcctcggtgg cacggcggat gtcggccggg 
gagatagatt tgtagagaga gactggtgat 
ttccttatat agaggaaggt cttgcgaagg 
agtggagata tcacatcaat ccacttgctt
cacgatgctc ctcgtgggtg ggggtccatc 
aacgatagcc tttcctttat cgcaatgatg 
tgtccttttg atgaagtgac agatagctgg 
taccctttgt tgaaaagtct caatagccct 
cttggagfcag acgagagtgt cgtgctccac 
agacgtggtt ggaacgtctt ctttttccac 
gggaccactg tcggcagagg catcttgaac 
tttgtaggtg ccaccttcct tttctactgt 
atggaatccg aggaggtttc ccgatattac 
gtcttctgag actgtatctt tgatattctt 
gttggcaagc tgctctagcc aatacgcaaa 
taatgcagct ggcacgacag gtttcccgac 
aatgtgagfct agctcactca ttaggcaccc 
atgttgtgtg gaattgtgag cggataacaa 
tacgaattcg agccttgact agagggtcga 
gagttfcggac aaaccacaac tagaatgcag 
gatgctattg cfctfcatttgt aaccattata 
gaactccagc atgagatccc cgcgctggag 
tccgaagccc aacctttcat agaaggcggc 
gtcctgctcc tcggccacga agtgcacgca 
ccgcccccac ggctgctcgc cgatctcggt 
cgtggacacg acctccgacc actcggcgta 
ggccagggtg ttgtccggca ccacctggtc 
gtcccggacc acaccggcga agtcgtcctc 
ggtccagaac tcgaccgctc cggcgacgtc 
caacttggcc atggatccag atttcgctca 
gcaggaattc gatcgacact ctcgtctact 
accaaagggc tattgagact tttcaacaaa 
attgcccagc tatctgtcac ttcatcaaaa 
aatgccatca ttgcgataaa ggaaaggcta 
ccaaagatgg acccccaccc acgaggagca 
cttcaaagca agtggatfcga tgtgataaca 
agaatatcaa agatacagtc tcagaagacc 
taatatcggg aaacctcctc ggattccatt 
cagtagaaaa ggaaggtggc acctacaaat 
ttcaagatgc ctctgccgac agtggtccca 
tggaaaaaga agacgttcca accacgtctt 
ctgacgtaag ggatgacgca caatcccact 
aagttcattt catttggaga ggacacgctg 
tctctcgagc tttcgcagat ccgggggggc 
cgacgtctgt cgagaagttt ctgatcgaaa 
tctcggaggg cgaagaatct cgtgctttca 
fcgcgggtaaa tagctgcgcc gatggtttct 
catcggccgc gctcccgatt ccggaagtgc 
cctattgcat ctcccgccgt gcacagggtg 
tgcccgctgt tctacaaccg gtcgcggagg 
gccagacgag cgggttcggc ccattcggac 
gtgatttcat atgcgcgatt gctgatcccc 
acaccgtcag tgcgtccgtc gcgcaggctc 
cggggagctg ttggctggct ggtggcagga 6180
gacaacttaa taacacattg cggacgtttt 6240
gggggatctg gafctttagta ctggafcttfcg 6300 
agtattttac aaatacaaat acatactaag 6360 
accctatagg aaccctaatt cccttatctg 6420
tcgagtcaaa tctcggtgac gggcaggacc 6480
agctgccaga aacccacgtc atgccagttc 6540
ccgcgggggg catatccgag cgcctcgtgc 6600
atgacagcga ccacgctctt gaagccctgt 6660 
agcgtggagc ccagtcccgt ccgctggtgg 6720
gfcccagtcgt aggcgttgcg tgccttccag 6780 
ccgtccacct cggcgacgag ccagggatag 6840 
cactcctgcg gttcctgcgg cfccggfcacgg 6900
ttgacgatgg tgcagaccgc cggcatgtcc 6960 
cgtcgtfccfcg ggctcatggt agactcgaga 7020
ttcagcgtgt cctctccaaa tgaaatgaac 7080 
atagtgggat tgtgcgtcat cccttacgtc 7140
fcgaagacgtg gfctggaacgt cttctttttc 7200 
tttgggacca ctgtcggcag aggcatcttg 7260
gcatttgtag gtgccacctt ccttttctac 7320 
gcaatggaat ccgaggaggt ttcccgatat 7380
ttggtcttct gagactgtat ctttgatatt 7440 
catgttatca catcaatcca cttgctttga 7500 
gatgctcctc gtgggtgggg gtccatcttt 7560 
gatagccttt cctttatcgc aatgatggca 7620 
ccttttgatg aagtgacaga tagctgggca 7680 
cctttgttga aaagtctcaa tagccctttg 7740 
ggagtagacg agagtgtcgt gctccaccat 7800
ccgcctctcc ccgcgcgttg gccgattcat 7860 
tggaaagcgg gcagtgagcg caacgcaatt 7920
caggctttac actttatgct tccggctcgt 7980
tttcacacag gaaacagcta tgaccatgat 8040
cggtatacag acatgataag atacattgat 8100
tgaaaaaaat gctttatttg tgaaatttgt 8160 
agctgcaata aacaagttgg ggtgggcgaa 8220
gatcatccag ccggcgtccc ggaaaacgat 8280
ggtggaatcg aaatctcgta gcacgtgtca 8340
gttgccggcc gggtcgcgca gggcgaactc 8400
catggccggc ccggaggcgt cccggaagtt 8460
cagctcgtcc aggccgcgca cccacaccca 8520 
ctggaccgcg ctgatgaaca gggtcacgtc 8580
cacgaagtcc cgggagaacc cgagccggtc 8640
gcgcgcggtg agcaccggaa cggcactggt 8700 
agttagtata aaaaagcagg cttcaatcct 8760 
ccaagaatat caaagataca gtctcagaag 8820
gggtaatatc gggaaacctc ctcggattcc 8880
ggacagtaga aaaggaaggt ggcacctaca 8940
fccgttcaaga tgcctctgcc gacagtggtc 9000 
tcgtggaaaa agaagacgtt ccaaccacgt 9060 
tggtggagca cgacactctc gtctactcca 9120 
aaagggctat tgagactttt caacaaaggg 9180
gcccagctat ctgtcacttc atcaaaagga 9240
gccatcattg cgataaagga aaggctatcg 9300
aagatggacc cccacccacg aggagcatcg 9360 
caaagcaagt ggattgatgt gatatctcca 9420
atccttcgca agaccttcct ctatataagg 9480 
aaatcaccag tctctctcta caaatctatc 9540
aatgagatat gaaaaagcct gaactcaccg 9600 
agttcgacag cgtctccgac ctgafcgcagc 9660
gcttcgatgt aggagggcgt ggatatgtcc 9720
acaaagatcg ttatgtttat cggcactttg 9780 
ttgacattgg ggagtttagc gagagcctga 9840
tcacgttgca agacctgcct gaaaccgaac 9900
ctatggatgc gatcgctgcg gccgatctta 9960
cgcaaggaat cggtcaatac actacatggc 10020
atgtgtatca ctggcaaact gtgatggacg 10080 
tcgatgagct gatgcfcttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaafcgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc fcggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggfccfc tgaccaactc tatcagagct fcggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaafcfcaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttgacctg 10860 
gaatatcgcg agtaaactga aaatcacgga aaatgagaaa tacacacttt aggacgtgaa 10920 
atatggcgag gaaaactgaa aaaggtggaa aatttagaaa tgtccactgt aggacgtgga 10980
atatggcaag aaaactgaaa atcatggaaa atgagaaaca tccacttgac gacttgaaaa 11040 
atgacgaaat cactaaaaaa cgtgaaaaat gagaaatgca cactgaagga ctccgcggga 11100
attcgattgt gctagccaat gtttaacaag atgtcaagca caatgaatgt tggtggttgg 11160 
tggtcgtggc tggcggtggt ggaaaattgc ggtggttcga gcggtagtga fccggcgatgg 11220 
fcfcggtgtfctg cagcggtgtt tgatatcgga afccactfcafcg gtggttgtca caatggaggt 11280
gcgtcatggt tattggtggt tggtcatcta tatattttta taataatatt aagtatttta 11340 
cctatttttt acatattttt tattaaattt atgcafcfcgtt tgtattttta aatagttttt 11400 
atcgtacttg ttttataaaa tattttatta ttttatgtgt tatattatta cttgatgtat 11460 
tggaaatttt ctccattgtt ttttctatat ttataataat tttcttattt ttttttgttt 11520 
tattatgtat tttttcgttt tataataaat afcfcfcattaaa aaaaatatta tttttgtaaa 11580 
atatatcatt tacaatgttt aaaagtcafct tgtgaatata ttagctaagt tgtacttctt 11640
tttgtgcatt tggtgttgta catgtctatt atgattctct ggccaaaaca tgtctactcc 11700
fcgtcactfcgg gtttfcttttt ttaagacata atcactagtg attatatcta gactgaaggc 11760 
gggaaacgac aatctgatca tgagcggaga attaagggag tcacgttatg acccccgccg 11820 
atgacgcggg acaagccgtt ttacgtttgg aactgacaga accgcaacgt tgaaggagcc 11880 
actcagccgc gggfcttctgg agtttaatga gctaagcaca tacgtcagaa accattattg 11940
cgcgttcaaa agtcgcctaa ggtcactatc agctagcaaa tatttcttgt caaaaatgct 12000
ccactgacgt tccataaatt cccctcggta tccaattaga gtctcatatt cactctcaat 12060
ccaaataatc tgcaccggat ctcgagatcg aattcccgcg gccgcgaatt cactagtgga 12120
tccccgggta cggfccagtcc cttatgttac gtcctgtaga aaccccaacc cgtgaaatca 12180 
aaaaactcga cggcctgtgg gcattcagtc tggatcgcga aaactgtgga attgagcagc 12240 
gttggtggga aagcgcgtta caagaaagcc gggcaattgc tgtgccaggc agttttaacg 12300
atcagttcgc cgatgcagat attcgtaatt atgtgggcaa cgtctggtat cagcgcgaag 12360
tctttatacc gaaaggttgg gcaggccagc gtatcgtgct gcgtttcgat gcggtcactc 12420
attacggcaa agtgtgggtc aataatcagg aagtgatgga gcatcagggc ggctatacgc 12480
catttgaagc cgatgtcacg ccgtatgtta ttgccgggaa aagtgtacgt atcacagttt 12540 
gtgtgaacaa cgaactgaac tggcagacta tcccgccggg aatggtgatt accgacgaaa 12600
acggcaagaa aaagcagtct tacttccatg atttctttaa ctacgccggg atccatcgca 12660
gcgtaatgct ctacaccacg ccgaacacct gggtggacga tatcaccgtg gtgacgcatg 12720
tcgcgcaaga ctgtaaccac gcgtctgttg actggcaggt ggtggccaat ggtgatgtca 12780
gcgttgaact gcgtgatgcg gatcaacagg tggttgcaac tggacaaggc accagcggga 12840
ctttgcaagt ggtgaatccg cacctctggc aaccgggtga aggttatctc fcatgaactgt 12900
acgtcacagc caaaagccag acagagtgtg atatctaccc gctgcgcgtc ggcafcccggt 12960
cagtggcagt gaagggcgaa cagttcctga tcaaccacaa accgttctac fcttactggct 13020
ttggccgtca tgaagatgcg gatttgcgcg gcaaaggatt cgataacgtg ctgatggtgc 13080
acgatcacgc attaatggac tggafctgggg ccaactccta ccgtacctcg cattaccctt 13140
acgctgaaga gatgctcgac tgggcagatg aacatggcat cgfcggtgatt gatgaaactg 13200 
cagctgtcgg ctttaacctc fcctttaggca ttggtttcga agcgggcaac aagccgaaag 13260
aactgtacag cgaagaggca gtcaacgggg aaactcagca ggcgcactta caggcgatta 13320
aagagctgat agcgcgtgac aaaaaccacc caagcgtggt gafcgtggagt attgccaacg 13380
aaccggatac ccgtccgcaa 9gtgcacggg aatatttcgc gccactggcg gaagcaacgc 13440 
gtaaactcga tccgacgcgt ccgatcacct gcgtcaatgt aatgttctgc gacgctcaca 13500
ccgataccat cagcgatctc tttgatgtgc tgtgcctgaa ccgttattac ggttggtatg 13560
tccaaagcgg cgatttggaa acggcagaga aggtactgga aaaagaactt ctggcctggc 13620
aggagaaact gcatcagccg attatcatca ccgaatacgg cgtggatacg ttagccgggc 13680
tgcactcaat gtacaccgac atgtggagtg aagagtatca gtgtgcatgg ctggatatgt 13740
atcaccgcgt ctttgatcgc gtcagcgccg tcgtcggtga acaggtatgg aatttcgccg 13800
attttgcgac ctcgcaaggc atattgcgcg ttggcggtaa caagaagggg atcttcaccc 13860
gcgaccgcaa accgaagtcg gcggcttttc tgctgcaaaa acgctggact ggcatgaact 13920 
tcggtgaaaa accgcagcag ggaggcaaac aatgaatcaa caactctcct ggcgcaccat 13980
cgtcggctac agcctcggga attgcgtacc gagctcgaat ttccccgatc gttcaaacat 14040
ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 14100 
atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 14160
gagatgggtt
aatatagcgc 
ggaattcgat 
ctggcgttac 
gcgaagaggc 
agagcagctt 
ttgacaggat 
atttaaaagg 



ttfcatgatta 
gcaaactagg 
atcaagcfctg 
ccaacttaat 
ccgcaccgat 
gagcttggat 
atattggcgg 
gcgtgaaaag 



gagtcccgca 
ataaattatc 
gcactggccg 
cgccttgcag 
cgcccttccc 
cagattgtcg 
gtaaacctaa 
gtttatccgt 



attatacafct 
gcgcgcggtg 
tcgttttaca 
cacatccccc 
aacagttgcg 
tttcccgcct 
gagaaaagag 
tcgtccatfct 



taatacgcga 
tcatctatgt 
acgtcgtgac 
tttcgccagc 
cagcctgaat 
tcagtttaaa 
cgtttattag 
gtafcgfcg 



tagaaaacaa 
tactagatcg 
tgggaaaacc 
tggcgtaata 
ggcgaatgct 
ctatcagtgt 
aataacggat 



14220 
14280 
14340 
14400 
14460 
14520 
14580 
14627 



<210> 22 
<211> 4257 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> pPUR Plasmid 



<400> 22 

ctgtggaatg 

atgcaaagca 

gcaggcagaa 

actccgccca 

ctaatttttt 

tagtgaggag 

ggccgccacg 

gacgaccttc

cccgggccgt 

tcgacccgga 

tcgggctcga 

ccacgccgga 

agttgagcgg 

ggcccaagga 

agggtctggg 

ccgccttcct 

ccgtcaccgc 

ccggtgcctg

atggctccga 

caccgactct 

aaaaacctcc 

aacttgttta 

aataaagcat 

tatcatgtct 

ttgagaggac 

gtcacttaac 

tttaaaatat 

acaaatgtca 

ctcatcaaga 

cccacctgtg 

gcactccact 

ctgactgtca 

gtttgctaac 

tgacccttga 

gtttaacata 

aatatttcca 

ggcctcgtga 

tcaggtggca

cattcaaata 

aaaaggaaga 

ttttgccttc 

cagttgggtg 

agttttcgcc 

gcggtattat 

cagaatgact 

gtaagagaat

ctgacaacga 

gtaactcgcc 

gacaccacga 



tgfcgtcagtt 
tgcatctcaa 
gtatgcaaag 
tcccgcccct 
ttatttatgc 
gcfcttfctfcgg 
accggtgccg 
catgaccgag 
acgcaccctc 
ccgccacatc 
catcggcaag 
gagcgtcgaa 
ttcccggctg 
gcccgcgtgg 
cagcgccgtc 
ggagacctcc 
cgacgtcgag 
acgcccgccc 
ccgaagccga 
agaggatcat 
cacacctccc 
ttgcagctta 
ttttttcact 
ggatccccag 
attccaatca 
aaaaaggaaa 
ctgggaagtc 
acagcagaaa 
agcactgtgg 
taggttccaa 
ggataagcat 
actgtagcat 
acaccctgca 

atgggttttc 

gcagttaccc 
caggttaagt 
tacgcctatt 
cttttcgggg 
tgtafcccgcfc 
gtatgagtat 
ctgtttttgc 
cacgagtggg 
ccgaagaacg 
cccgtgttga 
tggttgagta 
tatgcagtgc 
tcggaggacc
tfcgatcrgfctg 
tgcctgcagc 



agggtgtgga 
ttagtcagca 
catgcatctc 
aactccgccc 
agaggccgag 
aggcctaggc 
ccaccatccc 
tacaagccca
gccgccgcgt 
gagcgggtca 
gtgtgggtcg 
gcgggggcgg 
gccgcgcagc 
ttcctggcca 
gtgctccccg 
gcgccccgca 
gtgcccgaag 
cacgacccgc 
ccegggcggc 
aatcagccat 
ccfcgaacctg 
taatggttac 
gcattctagt 
gaagctcctc 
taggctgccc 
ttgggtaggg 
ccttccactg 
catacaagct 
ttgctgtgtt 
aatatctagt 
tatccttatc 
tttttggggt 
gctccaaagg 
cagcaccatt 
caafcaaccfcc 
cctcatttaa 
tttataggtt 
aaatgtgcgc 
catgagacaa 
tcaacatttc 
tcacccagaa 
ttacatcgaa 
ttttccaatg 
cgccgggcaa 
ctcaccagtc 
tgccataacc 
gaaggagcta
ggaaccggag 
aatggcaaca 



aagtccccag 
accaggtgtg 
aattagtcag 
agttccgccc 
gccgccfccgg 
ttttgcaaaa 
ctgacccacg 

cggtgcgcct 

tcgccgacta 
ccgagctgca 
cggacgacgg 
tgttcgccga 
aacagatgga 
ccgtcggcgt 
gagtggaggc 
acctcccctt 
gaccgcgcac 
agcgcccgac 
cccgccgacc 
accacattfcg 
aaacataaaa 
aaataaagca 
tgtggtttgt 
tgtgtcctca 
atccaccctc 
gtttttcaca 
ctgtgttcca 
gtcagctttg 
agtaatgtgc 
gttttcattt 
caaaacagcc 
tacagtttga 
ttccccacca 
ttcatgagtt 
agttttaaca 
attaggcaaa 
aatgtcatga 
ggaaccccta 
taaccctgat 
cgtgtcgccc 
acgctggtga 
ctggatctca 
atgagcactt 
gagcaactcg
acagaaaagc 
atgagfcgata 
acegcttttt 
ctgaatgaag 
aegttgegea 



gctccccagc 
gaaagtcccc 
caaccafcagt 
attctccgcc 
cctctgagct 
agettgeatg 
cccctgaccc 
cgccacccgc 
ccccgccacg 
agaactcttc 
cgccgcggtg 
gatcggcccg 
aggcctcctg 
ctcgcccgac 
ggccgagcgc 
etacgagegg 
ctggtgcatg 
cgaaaggagc 
ccgcacccgc 
tagaggtfctt 
tgaatgcaat 
atagcatcac 
ccaaactcat 
taaaccctaa 
tgtgtcctcc 
gaccgctttc 
gaagtgttgg 
cacaagggcc 
aaaacaggag 
ttacttggat 
ttgfcggtcag 
gcaggatatt 
acagcaaaaa 
ttttgtgtcc 
gtaacagctt 
ggaattcttg 
taataatggt 
tttgtfctatt 
aaatgettea 
ttattccctt 
aagtaaaaga 
acageggtaa 
ttaaagttct 
gtcgccgcat 
atettaegga 
acactgcggc 
tgcacaacat 
ccataccaaa 
aactattaac 



aggcagaagt 
aggctcccca 
cccgccccta 
ccatggctga 
attccagaag 
cctgcaggtc 
ctcacaagga 
gacgacgtcc 
cgccacaccg 
ctcacgcgcg 
gcggtctgga 
cgcatggccg 
gcgccgcacc 
caccagggca 
gccggggtgc 
ctcggcttca 
acccgcaagc 
gcacgacccc 
ccccgaggcc 
acttgettta 
tgttgttgfct 
aaatttcaca 
caatgtatct 
cctcctctac 
tgttaattag 
fcaagggtaat 
taaacagccc 
caacaccctg 
gcacattttc 
caggaaccca 
tgtfccatctg 
tggtcctgta 
aatgaaaatt 
ctgaatgcaa 
cccacatcaa 
aagacgaaag 
ttcttagacg 
tttctaaata 
ataatattga 
ttttgeggea 
tgctgaagat 
gatccttgag 
gctatgtggc 
acactattct 
tggcatgaca 
caacttactt 

gggggatcat 

egacgagegt 
tggegaacta 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880

2940 
cttactctag
ccacttctgc 
gagcgtgggt 
gtagttatct 
gagataggtg
ctttagattg 
gataatctca 
gtagaaaaga 
caaacaaaaa 
ctttttccga 
tagccgtagt 
ctaatcctgfc 
tcaagacgat 
cagcccagct 
gaaagcgcca 
ggaacaggag 
gtcgggtttc 
agcctatgga 
tttgctcaca 
tttgagtgag 
gaggaagcgg 
caccgcatat 



cttcccggca 
gctcggccct 
ctcgcggtat 
acacgacggg 
cctcactgat 
atttaaaact 
tgaccaaaat 
tcaaaggatc 
aaccaccgct 
aggtaactgg 
taggccacca 
taccagtggc 
agttaccgga 
tggagcgaac 
cgcttcccga 
agcgcacgag 
gccacctctg 
aaaacgccag 
tgttctttcc 
ctgataccgc 
aagagcgcct 
ggtgcactct 



acaattaata 
tccggctggc 
cattgcagca 
gagtcaggca 
taagcattgg 
tcatttttaa 
cccttaacgt 
ttcttgagat 
accagcggtg 
cttcagcaga 
cttcaagaac 
tgctgccagt 
taaggcgcag 
gacctacacc 
agggagaaag 
ggagcttcca 
acttgagcgt 
caacgcggcc 
tgcgttatcc 
tcgccgcagc 
gatgcggtat 
cagtacaatc 



gactggatgg 
tggtttattg 
ctggggccag 
actatggatg 
taactgtcag 
tttaaaagga 
gagttttcgt 
cctttttttc 
gtttgtttgc 
gcgcagatac 
tctgtagcac 
ggcgataagt
cggtcgggct
gaactgagat 
gcggacaggt 
gggggaaacg 
cgatttttgt 
tttttacggt 
cctgattctg 
cgaacgaccg 
tttctcctta 
tgctctgatg 



aggcggataa 
ctgataaatc 
abggtaagcc 
aacgaaatag 
accaagttta 
tctaggtgaa 
tccactgagc 
tgcgcgtaat 
cggatcaaga 
caaatactgt 
cgcctacata 
cgtgtcttac 
gaacgggggg 
acctacagcg 
atccggtaag 
cctggtatct 
gatgctcgtc 
tcctggcctt 
tggataaccg 
agcgcagcga 
cgcatctgtg 
ccgcatagtt 



agttgcagga 
tggagccggt 
ctcccgtatc 
acagatcgct 
ctcatatata 
gatccttttt 
gtcagacccc 
ctgctgcttg 
gctaccaact 
ccttctagtg 
cctcgctctg 

cgggttggac 

ttcgtgcaca 
tgagctatga 
cggcagggtc 
ttatagtcct 
aggggggcgg 
ttgctggcct 
tattaccgcc 
gtcagtgagc 
cggtatttca 
aagccag 



3000 
3060
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4257 



<210> 23 

<211> 2713 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pNEB193 Plasmid



<400> 23 

tcgcgcgttt 

cagcttgtct 

ttggcgggtg 

accatatgcg 

attcgccatt 

tacgccagct 

tttcccagtc 

gcgccggatc 

gcgtaatcat 

aacatacgag 

acattaattg 

cattaatgaa 

tcctcgctca 

tcaaaggcgg 

gcaaaaggcc 

aggctccgcc 

ccgacaggac 

gttccgaccc 

ctttctcata 

ggctgtgtgc 

cttgagtcca 

attagcagag 

ggctacacta 

aaaagagttg

gtttgcaagc 

tctacggggt 

ttatcaaaaa 

taaagtatat 

atctcagcga 

actacgatac 

cgctcaccgg 

agtggtcctg 

gtaagtagtt 

gtgtcacgct 

gttacatgat 



cggtgatgac 
gtaagcggat 
tcggggctgg 
gtgtgaaata 
caggctgcgc
ggcgaaaggg 
acgacgttgt 
cttaattaag 
ggtcatagct 
ccggaagcat 
cgttgcgctc 
tcggccaacg 
ctgactcgct 
taatacggtt 
agcaaaaggc 
cccctgacga 
tataaagata 
tgccgcttac 
gctcacgctg 
acgaaccccc 
acccggtaag 
cgaggtatgt 
gaaggacagt 
gtagctcttg 
agcagattac 
ctgacgctca 
ggatcttcac 
atgagtaaac 
tctgtctatt 
gggagggctt 
ctccagattt 
caactttatc 
cgccagttaa 
cgtcgtttgg 
cccccatgtt 



ggtgaaaacc 
gccgggagca
cttaactatg 
ccgcacagat 
aactgttggg
ggatgtgctg 
aaaacgacgg 
tctagagtcg 
gtttcctgtg 
aaagtgtaaa 
actgcccgct 
cgcggggaga 
gcgctcggtc
atccacagaa 
caggaaccgt 
gcatcacaaa 
ccaggcgttt 
cggatacctg 
taggtatctc 
cgttcagccc 
acacgactta 
aggcggtgct 
atttggtatc 
atccggcaaa 
gcgcagaaaa 
gtggaacgaa 
ctagatcctt 
ttggtctgac 
tcgttcatcc 
accatctggc 
atcagcaata 
cgcctccatc 
tagtttgcgc 
tatggcttca 
gtgcaaaaaa 



tctgacacat 
gacaagcccg 
cggcatcaga 
gcgtaaggag 
aagggcgatc
caaggcgatt 
ccagtgaatt 
actgtttaaa 
tgaaattgtt 
gcctggggtg 
ttccagtcgg 
ggcggtttgc 
gttcggctgc 
tcaggggata 
aaaaaggccg
aatcgacget 
ccccctggaa 
tccgcctttc 
agttcggtgt 
gaccgctgcg 
tcgccactgg 
acagagttct 
tgcgctctgc 
caaaccaccg 
aaaggatctc 
aactcaegtt 
ttaaattaaa 
agttaccaat 
atagttgect 
cccagtgctg 
aaccagccag 
cagtctatta 
aacgttgttg 
ttcagctccg 
gcggttagct 



gcagctcccg 
teagggegeg 
gcagattgta 
aaaatacege 
ggtgcgggcc 
aagttgggta 
egagcteggt 
cctgcaggca 
atccgctcac 
cctaatgagt 
gaaacctgtc 
gtattgggcg 
ggcgagcggt 
aegcaggaaa 
cgttgctggc 
caagtcagag 
gctccctcgt 
tcccttcggg 
aggtegtteg 
ccttatccgg 
cagcagccac 
tgaagtggtg 
tgaagccagt 

ctggtagcgg 

aagaagatcc 
aagggatttt 
aatgaagttt 
gcttaatcag 
gactccccgt 
caatgatacc 
ceggaaggge 
attgttgccg 
ccattgctac 
gttcccaacg 
ccttcggtcc 



gagaeggtea 
teagegggtg 
ctgagagtgc 
atcaggcgcc 
tettegctafc 
aegecagggt 
acccgggggc 
tgcaagcttg 
aattccacac 
gagctaactc 
gtgccagctg 
ctcttccgct 
atcagctcac 
gaacatgtga 
gtttttccat 
gtggcgaaac 
gcgctctcct 
aagcgtggcg 
ctccaagctg 
taactatcgt 
tggtaacagg 
gcctaactac 
taccttegga 
tggttttttt 
tttgatcttt 
ggtcatgaga
taaatcaatc 
tgaggcacct 
cgtgtagata 
gcgagaccca 
egagegcaga 
ggaagctaga 
aggcategtg 
ateaaggega 
tecgategtt 



60 

120 

180 

240 

300

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 
gtcagaagta
cttactgtca 
ttctgagaat 
accgcgccac 
aaactcfccaa 
aactgatctt 
caaaatgccg 
ctttttcaat 
gaatgtattt 
cctgacgtct 
aggccctttc 



agttggccgc 
tgccatccgt 
agtgtatgcg 
atagcagaac 
ggatcttacc 
cagcatcttt 
caaaaaaggg 
attattgaag 
agaaaaataa 
aagaaaccat 
gtc 



agtgttatca 
aagatgcttt 
gcgaccgagt 
tttaaaagtg 
gctgttgaga 
tactttcacc 
aataagggcg 
catttatcag 
acaaataggg
tattatcatg 




ctcatggtta 
tctgtgactg 
tgctcttgcc 
ctcatcattg 
tccagttcga 
agcgtttctg 
acacggaaat 
ggttattgtc 
gttccgcgca 
acattaacct 



tggcagcact 
gtgagtactc 
cggcgtcaat 
gaaaacgttc 
tgtaacccac 
ggtgagcaaa 
gttgaatact 
tcatgagcgg 
catttccccg 
ataaaaatag 



gcataattct 
aaccaagtca 
acgggataat 

ttcggggcga 
tcgtgcaccc 
aacaggaagg 
catactcttc 
atacatattt 
aaaagtgcca 
gcgtatcacg 



<210> 24 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPUP Primer 
<400> 24 

ccttgcgcta atgctctgtt acagg 

<210> 25 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPDWN Primer 
<400> 25 

cagaggcagg gagtgggaca aaattg 

<210> 26 

<211> 4346 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pSV40193attPsensePUR Plasmid 



<400> 26 

ccggtgccgc 

atgaccgagt 

cgcaccctcg 

cgccacatcg 

atcggcaagg 

agcgtcgaag 

tcccggctgg 

cccgcgtggt 

agcgccgtcg 

gagacctccg 

gacgtcgagg 

cgcccgcccc 

cgaagccgac 

gaggatcata 

acacctcccc 

tgcagcttat 

tttttcactg 

gatccgcgcc 

gcttggcgta 

cacacaacat 

aactcacatt 

agctgcatta 

ccgcttcctc 

ctcactcaaa 



caccatcccc 
acaagcccac 
ccgccgcgtt 
agcgggtcac 
tgtgggtcgc 
cgggggcggt 
ccgcgcagca 
tcctggccac 
tgctccccgg 
cgccccgcaa 
tgcccgaagg 
acgacccgca 
ccgggcggcc 
atcagccata 
ctgaacctga 
aatggttaca 
cattctagtt 
ggatccttaa 
atcatggtca 
acgagccgga 
aattgcgttg 
atgaatcggc 
gctcactgac 
ggcggtaata 



tgacccacgc 
ggtgcgcctc 
cgccgactac 
cgagctgcaa 
ggacgacggc 
gttcgccgag 
acagatggaa 
cgtcggcgtc 
agtggaggcg 
cctccccttc 
accgcgcacc 
gcgcccgacc 
ccgccgaccc 
ccacatttgt 
aacataaaat 
aataaagcaa 
gtggtttgtc 
ttaagtctag 
tagctgtttc 
agcataaagt 
cgctcactgc 
caacgcgcgg 
tcgctgcgct 
cggttatcca 



ccctgacccc 
gccacccgcg 
cccgccacgc 
gaactcttcc 
gccgcggtgg 
atcggcccgc 
ggcctcctgg 
tcgcccgacc 
gccgagcgcg 
tacgagcggc 
tggtgcatga 
gaaaggagcg 
cgcacccgcc 
agaggtttta 
gaatgcaatt 
tagcatcaca 
caaactcatc 
agtcgactgt 
ctgtgtgaaa 
gtaaagcctg 
ccgctttcca 
ggagaggcgg 
cggtcgttcg 
cagaatcagg 



tcacaaggag 
acgacgtccc 
gccacaccgt 
t cacgcgcg t 
cggtctggac 
gcatggccga 
cgccgcaccg 
accagggcaa 
ccggggtgcc 
tcggcttcac 
cccgcaagc c 
cacgacccca 
cccgaggccc 
cttgctttaa 
gttgttgtta 
aatttcacaa 
aatgtatctt 
ttaaacctgc 
ttgttatccg 
gggtgcctaa 
gtcgggaaac 
tttgcgcatt 
gctgcggcga 
ggataacgca 



acgaccttcc 
ccgggccgta 
cgacccggac 
cgggc t cgac 
cacgccggag 
gttgagcggt 
gcccaaggag 
gggtctgggc 
cgccttcctg 
cgtcaccgcc 
cggtgcctga 
tggctccgac 
accgactcta 
aaaacctccc 
acttgtttat 
ataaagcatt 
atcatgtctg 
aggcatgcaa 
ctcacaattc 
tgagtgagct 
ctgtcgtgcc 
gggcgctctfc 
gcggtatcag 
ggaaagaaca 



2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2713 



25 



26 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 
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tgtgagcaaa 
tccataggct 
gaaacccgac 
ctcctgttcc 
tggcgctttc 
agctgggctg 
atcgtcttga 
acaggattag 
actacggcta 
tcggaaaaag 
tttttgtttg 
tcttttctac 
tgagattatc 
caatctaaag 
cacctatctc 
agataactac 
acccacgctc 
gcagaagtgg 
ctagagtaag 
tcgtggtgtc 
ggcgagttac 
tcgttgtcag 
attctcttac 
agtcattctg 
ataataccgc 
ggcgaaaact 
cacccaactg 
gaaggcaaaa 
tcttcctttt 
tatttgaatg 
tgccacctga 
tcacgaggcc 
agctcccgga 
agggcgcgtc 
agattgtact 
aataccgcat 

tgcgggcctc 

gttgggtaac 
agctgtggaa 
gtatgcaaag 
cagcaggcag 
taactccgcc 
gactaatttt 
agtagtgagg 
tcactaatac 
tatgtagtct 
gtttctcgtt 
tgttgcaacg 
cccactccct 



aggccagcaa 
ccgcccccct 
aggactataa 
gaccctgccg 
tcatagctca 
tgtgcacgaa 
gtccaacccg 
cagagcgagg 
cactagaagg 
agttggtagc 
c aag c age ag 
ggggtctgac 
aaaaaggat c 
tatatatgag 
agegatctgt 
gataegggag 
accggctcca 
tcctgcaact 
tagttcgeca 
acgctcgtcg 
atgatccccc 
aagtaagttg 
tgtcatgeca 
agaatagtgt 
gccacatagc 
ctcaaggatc 
atcttcagca 
tgccgcaaaa 
tcaatattat 
tatttagaaa 
cgtctaagaa 
ctttcgtctc 
gaeggtcaca 
agcgggtgtt 
gagagtgeae 
caggcgccat 
ttegctatta 
gccagggttt 
tgtgtgtcag 
catgcatctc 
aagtatgcaa 
catcccgccc 
ttttatttat 
aggctttttt 
catctaagta 
gttttttatg 
cagctttttt 
aacaggtcac 
gcctctgggg 



aaggccagga 
gacgagcatc 
agataccagg 
ettaceggat 
cgctgtaggt 
ccccccgttc 
gtaagacacg 
tatgtaggcg 
acagtatttg 
tcttgatccg 
attacgegea 
gc t cagtgga 
ttcacctaga 
taaacttggt 
etatttegtt 
ggcttaccat 
gatttatcag 
ttatccgcct 
gttaatagtt 
tttggtatgg 
atgttgtgca 
gccgcagtgt 
teegtaagat 
atgeggegae 
agaactttaa 
ttaccgctgt 
tcttttactt 
aagggaataa 
tgaagcattt 
aataaacaaa 
accattatta 
gcgcgtttcg 
gcttgtctgt 
ggcgggtgtc 
catatgeggt 
tcgccattca 
cgccagctgg 
tcccagtcac 
ttagggtgtg 
aattagtcag 
ageatgeate 
ctaactccgc 
geagaggecg 
ggaggctegg 
gttgattcat 
caaaatctaa 
atactaagtt 
tatcagtcaa 
ggcgcg 



acegtaaaaa 
acaaaaatcg 
cgtttccccc 
acctgtccgc 
atctcagttc 
agcccgaccg 
acttatcgcc 
gtgetacaga 
gtatctgege 
gcaaacaaac 
gaaaaaaagg 
acgaaaactc 
tccttttaaa 
ctgacagtta 
catccatagt 
ctggccccag 
caataaacca 
ccatccagtc 
tgcgcaacgt 
cttcattcag 
aaaaagcggt 
tatcactcat 
gcttttctgt 
egagttgetc 
aagtgctcat 
tgagatccag 
tcaccagcgt 
gggcgacacg 
atcagggtta 
taggggttcc 
tcatgacatt 
gtgatgaegg 
aagcggatgc 
ggggctggct 
gtgaaatacc 
ggctgcgcaa 
cgaaaggggg 
gacgttgtaa 
gaaagtcccc 
caaccaggtg 
tcaattagtc 
ccagttccgc 
aggccgcctc 
tacccccttg 
agtgactgca 
tttaatatat 
ggcattataa 
aataaaatca 



ggccgcgttg 
aegctcaagt 
tggaagctcc 
ctttctccct 
ggtgtaggtc 
ctgcgcctta 
ac tggcagca 
gttcttgaag 
tetgetgaag 
caccgctggt 
atctcaagaa 
acgttaaggg 
ttaaaaatga 
ecaatgetta 
tgcctgactc 
tgctgcaatg 
gccagccgga 
tattaattgt 
tgttgccatt 
ctccggttcc 
tagctccttc 
ggttatggca 
gactggtgag 
ttgcccggcg 
cattggaaaa 
ttcgatgtaa 
ttctgggtga 
gaaatgttga 
ttgtctcatg 
gcgcacattt 
aacctataaa 
tgaaaacctc 
egggagcaga 
taactatgcg 
geacagatge 
c t gt t gggaa 
atgtgctgca 
aacgacggcc 
aggctcccca 
tggaaagtcc 
agcaaccata 
ccattctccg 
ggectctgag 
egctaatget 
tatgttgtgt 
tgatatttat 
aaaagcattg 
ttatttgatt 



ctggcgtttt 
cagaggtggc 
ctcgtgcgct 
tegggaageg 
gttcgctcca 
teeggtaact 
gccactggta 
tggtggccta 
ccagttacct 
agcggtggtt 
gatcctttga 
attttggtca 
agttttaaat 
atcagtgagg 
cccgtcgtgt 
ataccgegag 
agggecgage 
tgccgggaag 
gctacaggca 
caacgatcaa 
ggtcctccga 
geactgeata 
tactcaacca 
teaataeggg 
cgttcttcgg 
cccactcgtg 
gcaaaaacag 
atactcatac 
ageggataca 
ccccgaaaag 
aataggcgta 
tgacacatgc 
caagcccgtc 
gcatcagagc 
gt aaggagaa 
gggegategg 
aggegattaa 
agtgaattcg 
gcaggcagaa 
ccaggctccc 
gtcccgcccc 
ccccatggct 
ctattccaga 
ctgttacagg 
tttacagtat 
atcattttac 
cttatcaatt 
tcaattttgt 



1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4346 



<210> 27 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamIntR Plasmid 



<400> 27 

gtcgacattg 

geccatatat 

ccaacgaccc 

ggactttcca 

atcaagtgta 

cctggcatta 

tattagtcat 

atctcccccc 



attattgact 
ggagttccgc 
ccgcccattg 
ttgacgtcaa 
teatatgeca 
tgcccagtac 
cgctattacc 
cctccccacc 



agttattaat 
gttacataac 
aegtcaataa 

tgggtggact 

agtacgcccc 
atgaccttat 
atgggtcgag 
cccaattttg 



agtaatcaat 
ttacggtaaa 
tgacgtatgt 
atttaeggta 
etattgaegt 
gggactttcc 
gtgagcccca 
tatttattta 



taeggggtea 
tggcccgcct 
tcccatagta 
aactgcccac 
caatgaeggt 
tacttggcag 
cgttctgctt 
ttttttaatt 



ttagttcata 60 
ggctgaccgc 120 
acgccaatag 180 
ttggcagtac 240 
aaatggcccg 300 
tacatctacg 360 
cactctcccc 420 
attttgtgca 480 
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gcgatggggg 
gggcggggcg 
tccttttatg 
gggagtcgct 
c cggc t c tga 

gggctgtaat 
ccfctaaaggg 
tgtgtgtgtg 
cgggcgcggc 
ggtgccccgc 

tgggggggtg 

cctccccgag 

gcggggctcg 

ccgcctcggg 
gtcgaggcgc 
gacttccttt 
tagcgggcgc 
cgtgcgfccgc 
acggctgcct 
gctctagagc 
acgtgctggt 
gtcatgagcg 
acagggaccc 
ctgaagctat 
cgagaa t caa 
tcctggccag 
caataaggag 
caatgctcaa 
cactgagcga 
ctgccactcg 
tgaaaattta 
ctgttgttac 
atggatatct 
tgcatattga 
t tggcggaga 
caaggtattt 
cctttcacga 
ttgctcaaca 
gaggcaggga 
cctatcagaa 
tttttccctc 
gctaataaag 
tcggaaggac 
gtttggcaac 
cagtatatga 
ggttagattfc 
tccttacatg 
gtccctcttc 
atagctgttt 
aagcataaag 
gcgctcactg 
tagtcagcaa 
tccgcccatt 
gcctcggcct 
tgcaaaaagc 
caaatttcac 
tcaatgtatc 
aggcggtttg 
cgttcggctg 
atcaggggat 
taaaaaggcc 
aaatcgacgc 
tccccctgga 
gtccgccttt 
cagttcggtg 
cgaccgctgc 
atcgccactg 



cggggggggg 
aggcggagag 
gcgaggcggc 
gcgttgcctt 
ctgaccgcgt 
tagcgcfctgg 
ctccgggagg 
cgtggggagc 
gcggggctfct 
ggtgcggggg 
agcagggggt 
ttgctgagca 
ccgtgccggg 
ccggggaggg 
ggcgagccgc 
gtcccaaatc 
gggcgaagcg 
cgcgccgccg 
tcggggggga 
ctctgctaac 
tgttgtgctg 
ccgggattta 
aaggacgggt 
acaggccaac 
cagtgataat 
cagaggaatc 
gggtctgcct 
tggatacata 
tgcattccga 
cgcagcaaaa 
tcaagcagca 
cgggcaacga 
ttatgtcgag 
tgctctcgga 
aaccataatt 
tatgcgcgca 
gttgcgcagt 
tcttctcggg 
gtgggacaaa 

ggtggtggct 

tgccaaaaat 
gaaatttatt 
atatgggagg 
atatgccata 
aacagccecc 
tttttatatt 
ttttactagc 
tcttatgaag 
cctgtgtgaa 
tgtaaagcct 
cccgctttcc 
ccatagtccc 
ctccgcccca 
ctgagctatt 
taacttgttt 
aaataaagca 
ttatcatgtc 
cgtattgggc 
cggcgagcgg 
aacgcaggaa 
gcgttgctgg 
tcaagtcaga 
agctccctcg 
ctcccttcgg 
taggtcgttc 
gccttatccg 
gcagcagcca 



gggggcgcgc 

gtgcggcggc 
ggcggcggcg 
cgccccgtgc 
tactcccaca 
tttaatgacg 
gcccfcttgtg 
gccgcgtgcg 
gtgcgctccg 
ggctgcgagg 
gtgggcgcgg 
cggcccggct 

cggggggtgg 

c t cgggggag 
agccattgcc 
tggcggagcc 
gtgcggcgcc 
tccccttctc 
cggggcaggg 
catgttcatg 
tctcatcatt 
ccccctaacc 
aaagagtttg 
attgagttat 
tccgttacgt 
aagcagaaga 
gatgctccac 
gacgagggca 
gaggcaatag 
tcfcagagtaa 
gaatcatcac 
gttzggtgatt 
caaagcaaaa 
atatcaatga 
gcatctactc 
cgaaaagcat 
ttgtctgcaa 
cataagtcgg 
attgaaatca 
ggtgtggcca 
fcatggggaca 
ttcattgcaa 
gcaaatcatt 
tgctggctgc 
tgctgtccat 
ttgttttgtg 
cagatttttc 
atccctcgac 
attgttatcc 

ggggtgccta 

agtcgggaaa 
gcccctaact 
tggctgacta 
ccagaagtag 
attgcagctt 
tttttttcac 
tggafcccgct 
gctcttccgc 
tat cage tea 
agaacatgtg 
cgtttttcca 
ggtggcgaaa 
tgcgctctcc 
gaagcgtggc 
gctccaagct 
gtaactatcg 
ctggtaacag 



gecaggeggg 
agecaatcag 
gecctataaa 
cccgctccgc 
ggtgagcggg 
getegfcttet 
egggggggag 
gcccgcgctg 
cgtgtgcgcg 
ggaacaaagg 
eggteggget 
tegggtgegg 
cggcaggtgg 
gggcgcggcg 
ttttatggta 
gaaatctggg 
ggcaggaagg 
catctccagc 

c ggggt t egg 

ccttcttctt 
ttggcaaaga 
tttatataag 
gattaggcag 
tttcaggaca 
tacattcatg 
cactcataaa 
ttgaagacat 
aggeggegt c 
ctgaaggeca 
ggagatcaag 
catgttggct 
tatgegaaat 
caggegtaaa 
aggaaacact 
gtcgcgaacc 
caggtctttc 
gactctatga 
acaccatggc 
aataagaatt 
atgccctggc 
teatgaagee 
tagtgtgttg 
taaaacatca 
catgaacaaa 
tccttattcc 
ttattttttt 
ctcctctcct 
ctgcagccca 
gctcacaatt 
afcgagtgagc 
cctgtcgtgc 
ccgcccatcc 
atttttttta 
tgaggaggct 
ataatggtta 
tgcattctag 
gcattaatga 
ttcctcgctc 
etcaaaggeg 
agcaaaaggc 
taggctccgc 
cccgacagga 
tgttccgacc 
gctttctcaa 
gggctgtgtg 
tcttgagtcc 
gattagcaga 



gcggggcggg 
agcggcgcgc 
aagcgaagcg 
gccgcctcgc 
cgggacggcc 
tfctcfcgtggc 
eggctegggg 
cccggcggct 
aggggagege 
ctgcgtgcgg 
gtaacccccc 
ggctccgtgc 
gggtgccggg 
gccccggagc 
ategtgegag 
aggcgccgcc 
aaatgggcgg 
cteggggctg 
cttctggcgt 
tttcctacag 
attcatggga 
aaacaatgga 
agacaggega 
eaaacaeaag 
gettgatege 
ttacatgagc 
caccacaaaa 
agecaagtta 
tataacaaca 
acttaegget 
cagacttgea 
gaagtggtct 
aattgecate 
tgafcaaafcgc 
gctttcatcc 
cttcgaaggg 
gaagcagata 
atcacagtat 
cactcctcag 
tcacaaatac 
ccttgagcat 
gaafctttttg 
gaatgagfcat 
ggtggctata 
at agaaaagc 
ctttaacatc 
gactactccc 
age 1 1 ggcgt 
ceaeacaaca 
taactcacat 
cagcggatcc 
cgcccctaac 
tttatgeaga 
tttttggagg 
caaataaagc 
ttgtggtttg 
atcggccaac 
actgactcgc 
gtaatacggt 
cagcaaaagg 
ccccctgacg 
ctataaagat 
ctgccgctta 
tgctcacgct 
cacgaacccc 
aacceggtaa 
gcgaggtatg 



gcgaggggcg 
tccgaaagtt 
cgcggcgggc 
gccgcccgcc 
cttctcctcc 
tgcgtgaaag 
ggtgcgtgcg 
gtgagcgctg 
ggceggggge 
ggtgtgtgcg 
cctgcacccc 
ggggcgtggc 

eggggegggg 

gccggcggct 
agggegcagg 
gcaccccctc 
ggagggcett 
ccgcaggggg 
gtgaccggcg 
ctcctgggca 
agaaggegaa 
tattactget 
ategcaatea 
cctctgacag 
tacgaaaaaa 
aaaattaaag 
gaaattgegg 
at cagat caa 
aaccatgtcg 
gacgaatacc 
atggaactgg 
gatategtag 
ccaacagcat 
aaagagattc 
ggcacagtat 
gatccgccta 
agegataagt 
cgtgatgaca 
gtgcaggctg 
cactgagatc 
ctgacttctg 
tgtctctcac 
ttggtttaga 
aagaggtcat 
cttgacttga 
cctaaaattt 
agtcatagct 
aatcatggtc 
tacgagcegg 
taattgcgtt 
gcatctcaat 
tccgcccagt 
ggccgaggcc 
cctaggcttt 
aatagcatca 
tccaaactca 
gegeggggag 
tgcgctcggt 
tatccacaga 
ccaggaaccg 
agcatcacaa 
accaggegtt 
ccggatacct 
gtaggtatct 
ccgttcagcc 
gacacgactt 
taggcggtgc 



540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 

3600 

3660 

3720 

3780 

3840 

3900 

3960 

4020 

4080 

4140 

4200 

4260 

4320 

4380 

4440 

4500 
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tacagagttc 
ctgcgctctg 
acaaaccacc 
aaaaggatct 
aaactcacgt 
tttaaattaa 
cagttaccaa 
catagttgcc 
ccccagtgct 
aaaccagcca 
ccagtctatt 
caacgttgtt 
attcagctcc 
agcggttagc 
actcatggtt 
ttctgtgact 
ttgctcttgc 
gctcatcatt 
atccagttcg 
cagcgtttct 
gacacggaaa 
gggttattgt 
ggttccgcgc 



ttgaagtggt 
ctgaagccag 
gctggtagcg 
caagaagatc 
taagggattt 
aaatgaagtt 
tgcttaatca 
tgactccccg 
gcaatgatac 
gccggaaggg 
aattgttgcc 
gccattgcta 
ggttcccaac 
tccttcggtc 
atggcagcac 
ggtgagtact 
ccggcgtcaa 
ggaaaacgtt 
atgtaaccca 

gggtgagcaa 

tgttgaatac 
ctcatgagcg 
acatttcccc 



ggcctaacta 
ttaccttcgg 
gtggtttttt 
ctttgatctt 
tggtcatgag 
ttaaatcaat 
gtgaggcacc 
tcgtgtagat 
cgcgagaccc 
ccgagcgcag 
gggaagctag 
caggcatcgt 
gatcaaggcg 
ctccgatcgt 
tgcataattc 
caaccaagtc 
tacgggataa 
cttcggggcg 
ctcgtgcacc 
aaacaggaag 
tcatactctt 
gatacatatt 
gaaaagtgcc 




cggctacact 
aaaaagagtt 
tgtttgcaag 
ttctacgggg 
attatcaaaa 
ctaaagtata 
tatctcagcg 
aactacgata 
acgctcaccg 
aagtggtcct 
agtaagtagt 
ggtgtcacgc 
agttacatga 
tgtcagaagt 
tcttactgtc 
attctgagaa 
taccgcgcca 
aaaactctca 
caactgatct 
gcaaaatgcc 
cctttttcaa 
tgaatgtatt 
acctg 



agaaggacag 
ggt agctctt 
cagcagatta 
tctgacgctc 
aggatcttca 
tatgagtaaa 
atctgtctat 
cgggagggct 
gctccagatt 
gcaactttat 
tcgccagtta 
tcgtcgtttg 
tcccccatgt 
aagttggccg 
atgccatccg 
tagtgtatgc 
catagcagaa 
aggatcttac 
tcagcatctt 
gcaaaaaagg 
tattattgaa 
tagaaaaata 



tatttggtat 
gatccggcaa 
cgcgcagaaa 
agtggaacga 
cctagatcct 
cttggtctga 
ttcgttcatc 
taccatctgg 
tatcagcaat 
ccgcctccat 
atagtttgcg 
gtatggcttc 
tgtgcaaaaa 
cagtgttatc 
taagatgctt 
ggcgaccgag 
ctttaaaagt 
cgctgttgag 
ttactttcac 
gaataagggc 
gcatttatca 
aacaaatagg 



4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5855 



<210> 28 
<211> 37 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> 5PacSV40 Primer 
<400> 28 

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37 

<210> 29 
<211> 20 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Antisense Zeo Primer 
<400> 29 

tgaacagggt cacgtcgtcc 20 

<210> 30 
<211> 1032 
<212> DNA 
<213> Escherichia Coli 
<220> 
<221> CDS 
<222> (1)...(1032) 
<223> nucleotide sequence encoding Cre recombinase 
<400> 30 

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48 
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96 
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144 
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192 
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240 
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288 
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336 
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384 
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432 
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480 
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528 
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576 
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624 
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672 
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720 
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768 
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816 
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864 
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912 
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960 
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008 
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

cgc ctg ctg gaa gat ggc gat tag 1032 
Arg Leu Leu Glu Asp Gly Asp * 
340 

<210> 31 
<211> 343 
<212> PRT 

<213> Escherichia Coli 
<400> 31 

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val 
1 5 10 15 

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg 
20 25 30 

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val 
35 40 45 

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe 
50 55 60 

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala 
65 70 75 80 

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn 
85 90 95 

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala 
100 105 110 

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly 
115 120 125 

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln 
130 135 140 

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn 
145 150 155 160 

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu 
165 170 175 

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg 
180 185 190 

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly 
195 200 205 

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp 
210 215 220 

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys 
225 230 235 240 

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu 
245 250 255 

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile 
260 265 270 

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly 
275 280 285 

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val 
290 295 300 

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile 
305 310 315 320 

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val 
325 330 335 

Arg Leu Leu Glu Asp Gly Asp 
340 

<210> 32 
<211> 33 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attB1 recognition sequence 
<400> 32 

tgaagcctgc ttttttatac taacttgagc gaa 33 

<210> 33 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att recognition sequence 

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 33 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 34 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> m-attB recognition sequence 
<221> misc_difference 
<222> 18 
<223> n is a or c or g or t/u 
<400> 34 

agccwgcttt yktrtacnaa ctsgb 25 

<210> 35 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> m-attR recognition sequence 
<221> misc_difference 
<222> 18 
<223> n is a or g or c or t/u 
<400> 35 

gttcagcttt cktrtacnaa ctsgb 25 

<210> 36 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> m-attL recognition sequence 
<221> misc_difference 
<222> 18 
<223> n is a or g or c or t/u 
<400> 36 

agccwgcttt cktrtacnaa gtsgb 25 

<210> 37 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> m-attP1 recognition sequence 
<221> misc_difference 
<222> 18 
<223> n is a or g or c or t/u 
<400> 37 

gttcagcttt yktrtacnaa gtagb 25 

<210> 38 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB2 recognition sequence 

<400> 38 

agcctgcttt cttgtacaaa cttgt 25 

<210> 39 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 recognition sequence 

<400> 39 

acccagcttt cttgtacaaa cttgt 25 

<210> 40 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1 recognition sequence 

<400> 40 

gttcagcttt tttgtacaaa cttgt 25 

<210> 41 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 recognition sequence 
<400> 41 

gttcagcttt cttgtacaaa cttgt 25 

<210> 42 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attR3 recognition sequence 
<400> 42 

gttcagcttt cttgtacaaa gttgg 25 

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1 recognition sequence 
<400> 43 

agcctgcttt tttgtacaaa gttgg 25 

<210> 44 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attL2 recognition sequence 
<400> 44 

agcctgcttt cttgtacaaa gttgg 25 

<210> 45 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attL3 recognition sequence 
<400> 45 

acccagcttt cttgtacaaa gttgg 25 

<210> 46 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attP1 recognition sequence 
<400> 46 

gttcagcttt tttgtacaaa gttgg 25 

<210> 47 
<211> 25 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attP2, P3 recognition sequence 
<400> 47 

gttcagcttt cttgtacaaa gttgg 25 

<210> 48 
<211> 282 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> attP recognition sequence 
<400> 48 

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60 
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120 
tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180 
tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240 
aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282 

<210> 49 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> nucleotide sequence encoding Integrase E174R 

<221> CDS 

<222> (1) . . . (1071) 

<223> Integrase E174R 

<400> 49 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48 
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96 
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144 
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192 
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240 
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288 
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336 
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384 
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432 
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480 
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528 
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576 
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624 
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672 
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720 
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768 
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816 
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864 
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912 
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc tat gag aag 960 
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008 
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056 
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

att gaa atc aaa taa 1071 
Ile Glu Ile Lys * 
355 

<210> 50 
<211> 356 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Integrase E174R 
<400> 50 

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu 
1 5 10 15 

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly 
20 25 30 

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala 
35 40 45 

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu 
50 55 60 

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu 
65 70 75 80 

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr 
85 90 95 

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro 
100 105 110 

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu 
115 120 125 

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg 
130 135 140 

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile 
145 150 155 160 

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg 
165 170 175 

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala 
180 185 190 

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val 
195 200 205 

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile 
210 215 220 

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile 
225 230 235 240 

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys 
245 250 255 

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile 
260 265 270 

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr 
275 280 285 

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro 
290 295 300 

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys 
305 310 315 320 

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp 
325 330 335 

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys 
340 345 350 

Ile Glu Ile Lys 
355 

<210> 51 
<211> 34 
<212> DNA 
<213> Artificial Sequence 
<220> 
<223> Lox P Site 
<400> 51 

ataacttcgt ataatgtatg ctatacgaag ttat 34 

AMENDED CLAIMS 

[received by the International Bureau on 24 December 2002 (24.12.02); 
original claims 3, 9, 16, 20, 35, 52, 56, 80, 101, 105, 107, 111, 116, 123 and 128-132 amended; 

remaining claims unchanged (17 pages)] 

1 • A method for producing an artificial chromosome, comprising: 

introducing nucleic acid into a cell comprising one or more plant 
chromosomes; and 
5 selecting a cell comprising an artificial chromosome that 

comprises one or more repeat regions 
wherein: 

one or more nucleic acid units is (are) repeated in a repeat 

region; 

10 repeats of a nucleic acid unit have common nucleic acid 

sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

2. The method of claim 1 , wherein the artificial chromosome is 
15 predominantly made up of one or more repeat regions. 

3. The method of claim 1 , wherein the nucleic acid introduced into 
the cell comprises a nucleic acid sequence that facilitates amplification of a 
region of a plant chromosome or that targets the nucleic acid to an 
amplifiable region of a plant chromosome. 

20 4. The method of claim 1 , wherein the nucleic acid introduced into 

the cell comprises one or more nucleic acids selected from the group 
consisting of rDNA, lambda phage DNA and satellite DNA. 

5. The method of claim 4, wherein the nucleic acid comprises 
plant rDNA. 

25 6. The method of claim 5, wherein the rDNA is from a plant 

selected from the group consisting of Arabidopsis, Nicotiana, Solatium, 
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza. 

7. The method of claim 4, wherein the nucleic acid comprises 
animal rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises
rDNA comprising a sequence of an intergenic spacer region.

10. The method of claim 9, wherein the intergenic spacer region is
from DNA from a plant selected from the group consisting of Arabidopsis,
Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean.

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of
cells containing the nucleic acid.

12. The method of claim 11, wherein the nucleic acid sequence
encodes a fluorescent protein.

13. The method of claim 12, wherein the protein is a green
fluorescent protein.

14. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises sorting of cells into which
nucleic acid was introduced.

15. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises fluorescent in situ 
hybridization (FISH) analysis of cells into which nucleic acid was introduced. 

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus chromosomes.

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat
region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions. 

22. A plant cell comprising an artificial chromosome, wherein the
artificial chromosome is produced by the method of claim 1 or claim 2.

23. A method of producing a transgenic plant, comprising 
introducing the artificial chromosome of claim 20 or claim 21 into a plant cell. 

24. The method of claim 23, wherein the artificial chromosome 
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of enzymes, antisense
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors,
ribozymes, therapeutic proteins and biopharmaceutical proteins.

26. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid 
encodes a product that provides for resistance to diseases, insects, herbicides 
or stress in the plant. 

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant.

29. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that alters the nutrient utilization and/or improves the
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid
is contained within a bacterial artificial chromosome (BAC) or a yeast
artificial chromosome (YAC).

31. A method of identifying plant genes encoding particular traits,
comprising:
generating an artificial chromosome comprising euchromatic
DNA from a first species of plant;
introducing the artificial chromosome into a plant cell of a
second species of plant; and
detecting phenotypic changes in the plant cell comprising the
artificial chromosome and/or a plant generated from the plant cell comprising
the artificial chromosome.

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

34. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising a SATAC.

35. The method of claim 31, wherein the artificial chromosome is a
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a
neo-centromere and euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers
resistance to phosphinothricin, ammonium glufosinate, glyphosate,
kanamycin, hygromycin, dihydrofolate or sulfonylurea.

38. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a
method comprising:
introducing into a plant cell of a first plant species an artificial
chromosome capable of undergoing homologous recombination with the DNA
of the first plant species;
selecting for a recombination event between the artificial chromosome
and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from
the first plant species.

39. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a
method comprising:
introducing into a plant cell of a first species an artificial chromosome
capable of undergoing site-specific recombination with the DNA of the first
plant species;
selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species, and
selecting an artificial chromosome comprising euchromatic DNA from
the first plant species.

40. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence.

42. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence
and the artificial chromosome comprises a site-specific recombination
sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence
and the artificial chromosome comprises a site-specific recombination
sequence that is complementary to the site-specific recombination sequence
of the plant cell of a first plant species.

44. The method of claim 39, wherein the site-specific 
recombination is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into a first chromosome of a plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into a second chromosome of the plant cell;
introducing a recombinase activity into the plant cell, wherein
the activity catalyzes recombination between the first and second
chromosomes and whereby an acrocentric plant chromosome is produced.

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome and
the second nucleic acid is introduced into the distal end of the arm of the
second chromosome.

49. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into the pericentric heterochromatin of a chromosome in a
plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into the distal end of the chromosome, wherein the first and
second recombination sites are located on the same arm of the chromosome;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is
produced.

50. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising a recombination site adjacent
to nucleic acid encoding a selectable marker into a first plant cell;
generating a first transgenic plant from the first plant cell;
introducing nucleic acid comprising a promoter functional in a
plant cell, a recombination site and a recombinase coding region in operative
linkage into a second plant cell;
generating a second transgenic plant from the second plant cell;
crossing the first and second plants;
obtaining plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker; and
selecting a resistant plant that contains cells comprising an
acrocentric plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the
short arm of the acrocentric chromosome contains less than 5% euchromatic
DNA.

52. The method of claim 51, wherein the DNA of the short arm of the
acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA.

54. The method of any of claims 45-49, wherein the nucleic acid
introduced into a chromosome comprises nucleic acid encoding a selectable
marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant acrocentric chromosome in a
cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome that is
predominantly heterochromatic.

57. The method of claim 56, wherein the acrocentric chromosome is
produced by the method of any of claims 45-49.

58. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant.

62. The method of claim 9, wherein the rDNA is plant rDNA.

63. The method of claim 62, wherein the plant is a dicot plant
species.

64. The method of claim 62, wherein the plant is a monocot plant
species.

65. The method of claim 1, wherein the cell is a dicot plant cell.

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected 
from the group consisting of a bacteriophage P1 Cre recombinase, a yeast R 
recombinase and a yeast FLP recombinase. 

70. The method of claim 50, further comprising selecting first and
second transgenic plants wherein:
one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region
adjacent to the pericentric heterochromatin; and
the other plant comprises a chromosome comprising a
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the
two chromosomes are in the same orientation.

72. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising two site-specific
recombination sites into a cell comprising one or more plant chromosomes;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the two recombination sites, whereby
a plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific
recombination sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant chromosome in a cell,
wherein the chromosome contains adjacent regions of rDNA and
heterochromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant
chromosome into which the nucleic acid is introduced is an acrocentric
chromosome.

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA.

80. The method of claim 76, 77, or 79, wherein the 
heterochromatic DNA is pericentric heterochromatin. 

81. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome.

82. The vector of claim 81, wherein the amplifiable region
comprises heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region 
comprises rDNA. 

84. The vector of claim 81, wherein the sequence of nucleotides
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to facilitate amplification or
effect the targeting.

85. The vector of claim 84, wherein the sufficient portion contains
at least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from
an intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes 
a product that confers resistance to zeomycin. 

87. A plant transformation vector, comprising:
a recognition site for recombination;
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome; and
one or more selectable markers that when expressed in a plant
cell permit the selection of the cell; wherein
the plant transformation vector is for Agrobacterium-mediated
transformation of plants.

88. The vector of claim 81, wherein the recognition site comprises 
an att site. 

89. The vector of claim 81, that is pAgIIa or pAgIIb.

90. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises
an att site.

92. The vector of claim 90, further comprising a sequence of
nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline 
synthase (NOS) or CaMV35S. 

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region
comprises heterochromatic nucleic acid.

96. The vector of claim 92, wherein the amplifiable region 
comprises rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to effect the amplification or
the targeting.

98. The vector of claim 90, wherein the protein is a selectable
marker that permits growth of plant cells in the presence of an agent
normally toxic to the plant cells.

99. The vector of claim 98, wherein the selectable marker confers
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent
protein.

101. The vector of claim 100, wherein the fluorescent protein is 
selected from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of plant cells in the presence of an agent normally toxic to the plant cells; and
wherein the agent is not toxic to animal cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

103. A vector, comprising:
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region of
a plant chromosome, wherein the plant is selected from the group consisting
of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises 
an att site. 

105. A cell, comprising a vector of any of claims 81-86 and 88-104.

106. The cell of claim 105 that is a plant cell.

107. A method, comprising:
introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site
that recombines with the recognition site in the vector in the presence of the
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes.

108. The method of claim 107, wherein the recombination sites are 
att sites. 

109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises 
a promoter that upon recombination is operably linked to the selectable 
marker that in the vector is not operably associated with a promoter. 

111. The method of any of claims 107-110, further comprising,
transferring the resulting platform ACes into a plant cell to produce a plant
cell that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes
is isolated prior to transfer.

113. The method of claim 111, wherein the isolated ACes is
introduced into a plant cell by a method selected from the group consisting of 
protoplast transfection, lipid-mediated delivery, liposomes, electroporation, 
sonoporation, microinjection, particle bombardment, silicon carbide whisker- 
mediated transformation, polyethylene glycol (PEG)-mediated DNA uptake, 
lipofection and lipid-mediated carrier systems. 

114. The method of claim 111, wherein the resulting platform ACes 
is transferred by fusion of the cells. 

115. The method of claim 111, wherein the cells are plant 
protoplasts. 

116. The method of claim 107, wherein the cell is an animal cell.

117. The method of claim 116, wherein the animal cell is a
mammalian cell.

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.

119. A method, comprising:
introducing a vector of claim 81 into a plant cell;
culturing the plant cells; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein sufficient portion of the vector 
integrates into a chromosome in the plant cell to result in amplification of 
chromosomal DNA. 

121. The method of claim 119 or claim 120, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the 
artificial chromosome. 

123. A method, comprising:
introducing a vector into a cell, wherein:
i) the vector comprises:
a) nucleic acid encoding a selectable marker that is
not operably associated with any promoter, wherein the
selectable marker permits growth of animal cells in the presence
of an agent normally toxic to the animal cells; and wherein the
agent is not toxic to plant cells;



AMENDED SHEET (ARTICLE 19)
WO 2002/096923
PCT/US2002/017451



b) a recognition site for recombination; and
c) nucleic acid encoding a protein operably linked to
an animal promoter;
ii) the cell comprises:
a platform plant artificial chromosome (PAC) that
comprises a recombination site and an animal promoter that upon
recombination is operably linked to the selectable marker that, in
the vector, is not operably associated with a
promoter;
iii) introduction is effected under conditions whereby
the vector recombines with the PAC to produce a plant platform PAC that
contains the selectable marker operably linked to the promoter; and
culturing the resulting cell under conditions, whereby the protein
encoded by nucleic acid operably linked to an animal promoter is expressed.

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an 

ACes. 

126. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

127. The vector of claim 81, further comprising one or more selectable 
markers that when expressed in the plant cell permit the selection of the cell. 

128. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell
comprising one or more plant chromosomes; and
selecting a cell comprising an artificial chromosome that
comprises one or more repeat regions; wherein
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

129. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell
comprising one or more plant chromosomes; and
selecting a cell comprising an artificial chromosome that
comprises one or more repeat regions; wherein
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

130. The method of claim 123, wherein the cell into which the vector 
is introduced is an animal cell. 

131. The method of claim 130, wherein the cell is a mammalian cell.

132. The method of claim 78, wherein the heterochromatic DNA is
pericentric heterochromatin.

PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS
OF PREPARING PLANT ARTIFICIAL CHROMOSOMES

RELATED APPLICATIONS 

Benefit of priority is claimed to U.S. Provisional Application No.
60/294,687, filed May 30, 2001, by CARL PEREZ AND STEVEN
FABIJANSKI entitled PLANT ARTIFICIAL CHROMOSOMES, USES THEREOF
AND METHODS FOR PREPARING PLANT ARTIFICIAL CHROMOSOMES and
to U.S. Provisional Application No. 60/296,329, filed June 4, 2001, by CARL
PEREZ AND STEVEN FABIJANSKI entitled PLANT ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING PLANT
ARTIFICIAL CHROMOSOMES. This application is related to U.S. Provisional
Application No. 60/294,758, filed May 30, 2001, by EDWARD PERKINS et
al. entitled CHROMOSOME-BASED PLATFORMS and to U.S. Provisional
Application No. 60/366,891, filed March 21, 2002, by EDWARD
PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS. This
application is also related to U.S. Provisional Application Attorney Docket
No. 24601-420, filed May 30, 2002, by EDWARD PERKINS et al. entitled
CHROMOSOME-BASED PLATFORMS and to PCT International Patent
Application Attorney Docket No. 24601-420PC, filed May 30, 2002, by
EDWARD PERKINS et al. entitled CHROMOSOME-BASED PLATFORMS.
This application is related to U.S. application Serial No. 08/695,191, filed
August 7, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,025,155.

This application is also related to U.S. application Serial No. 08/682,080,
filed July 15, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES, now U.S. Patent No. 6,077,697.
This application is also related to U.S. application Serial No. 08/629,822, filed
April 10, 1996 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned), and is also
related to copending U.S. application Serial No. 09/096,648, filed June 12,
1998, by GYULA HADLACZKY and ALADAR SZALAY, entitled ARTIFICIAL
CHROMOSOMES, USES THEREOF AND METHODS FOR PREPARING
ARTIFICIAL CHROMOSOMES and to U.S. application Serial No. 09/835,682,
filed April 10, 1997 by GYULA HADLACZKY and ALADAR SZALAY, entitled
ARTIFICIAL CHROMOSOMES, USES THEREOF AND METHODS FOR
PREPARING ARTIFICIAL CHROMOSOMES (now abandoned). This
application is also related to copending U.S. application Serial No.
09/724,726, filed November 28, 2000, U.S. application Serial No.
09/724,872, filed November 28, 2000, U.S. application Serial No.
09/724,693, filed November 28, 2000, U.S. application Serial No.
09/799,462, filed March 5, 2001, U.S. application Serial No. 09/836,911,
filed April 17, 2001, and U.S. application Serial No. 10/125,767, filed April
17, 2002, each of which is by GYULA HADLACZKY and ALADAR SZALAY,
and is entitled ARTIFICIAL CHROMOSOMES, USES THEREOF AND
METHODS FOR PREPARING ARTIFICIAL CHROMOSOMES. This application
is also related to International PCT application No. WO 97/40183. Where
permitted, the subject matter of each of these applications is incorporated by
reference in its entirety.
FIELD OF THE INVENTION

Artificial chromosomes and methods of producing artificial
chromosomes, particularly for use in delivery of nucleic acids and expression
thereof in plants, are provided. Also provided are methods of use of artificial
chromosomes in the delivery of nucleic acids to host cells, including plant
cells, and the expression of the nucleic acids therein. The resulting plant
cells, tissues, organs and whole plants containing the artificial chromosomes,
plant cell-based methods for production of heterologous proteins and
methods of producing transgenic organisms, particularly plants, using the
artificial chromosomes are also provided.
BACKGROUND OF THE INVENTION 

The stable transfer of nucleic acids into plant cells and the expression 
5 of the nucleic acids therein poses many challenges. Many efforts at the 
stable introduction of nucleic acids into plant cells have utilized 
Agrobacterium-medtated transformation. Agrobacterium is a free-living 
Gram-negative soil bacterium. Virulent strains of this bacterium are able to 
infect plant tissue and induce the production of a neoplastic growth 
commonly referred to as a crown gall. Virulent strains of Agrobacterium 
contain a large plasmid DNA known as a Ti-plasmid that contains genes 
required for DNA transfer (vir genes) and replication as well as a region of 
DNA that is transferred to plant cells called T-DNA. The T-DNA region is 
bordered by T-DNA border sequences that are crucial to the DNA transfer 
process. These T-DNA border sequences are recognized by the vir genes 
encoded on the Ti-plasmid, and the vir genes are responsible for the DNA 
transfer process. 

Most wild-type Agrobacterium have a relatively broad dicot plant host 
range and are capable of transferring T-DNA regions up to 25 kilobases of 
DNA (e.g., nopaline strains) or more (e.g., octopine strains). Accordingly, 
numerous methods of using Agrobacterium to transfer DNA into plant cells 
have been developed based on the engineering of the Ti-plasmid to no longer 
contain the genes responsible for altered morphology and replacing these 
genes with a recombinant gene encoding a trait of interest. There are two 
primary types of Agrobacterium-based plant transformation systems, binary 
[see, e.g., U.S. Patent No. 4,940,838] and co-integrate [see, e.g., Fraley et 
al. (1985) Biotechnology 3:629-635] methods. The T-DNA border repeats 
are maintained in both systems and the natural DNA transfer process is used 
to transfer the portion of DNA located between the T-DNA borders into the 
plant cell. 

Another plant cell transformation system, termed biolistics, involves 
the bombardment of plant cells with microscopic particles coated with DNA 
encoding a new trait. The particles are rapidly accelerated, typically by gas 
or electrical discharge, through the cell wall and membranes, whereby the 
DNA is released into the cell and is incorporated into the genome of the cell. 
This method is used for transformation of many crops, including corn, wheat, 
barley, rice, woody tree species and others. 

A significant number of crop species of commercial interest have been 
transformed using either Agrobacterium-mediated or biolistic systems. 
However, these methods have many limitations that restrict their utility. For 
example, there are limits to the size of the heterologous DNA that can be 
transferred using these methods; typically, only one to two genes may be 
transferred. Thus, although these methods may have utility in producing 
crop products modified to contain a single new trait, such as insect or 

herbicide tolerance, they may not be sufficient to transfer DNA that will 
provide for multiple traits, or very large DNA segments encoding a 
multiplicity of traits. 

In addition, the genetically modified plant cells produced by these 
methods tend to contain the transferred DNA in euchromatic regions of the 

genomic DNA. Typically, a large number of independent transgenic insertion 
events must be screened before a suitable event (such as insertion of a gene 
into the host genomic DNA such that it provides a sufficient level of gene 
expression within temporal and spatial expectations and without evidence of 
gene rearrangement) is identified. 

Another limitation of these methods is the effort required to utilize 
them in the genetic modification of many commercially important crops. For 
example, transformation efficiency can vary with the crop and can be low, 
notably in cereal crops such as corn and wheat. Often the inserted genes 
are rearranged and unstable over generations. 

Furthermore, Agrobacterium tumefaciens relies on host-parasite 
interaction in order to be successful. As a result, Agrobacterium 
has a preference for some dicots, while other dicots, monocots and conifers 
are resistant to transformation via Agrobacterium. Self-replicating vectors 
have also been used in the transfer of nucleic acids into plant cells. Such 
episomal vectors contain DNA sequences that are required for DNA 
replication and sustainability of the vector in a living cell. In higher plants, 
very few episomal vectors have been developed. These episomal vectors 
have the drawback of having a very limited capacity for carrying genetic 
information and are unstable. One example of an episomal plant vector is 
the Cauliflower Mosaic Virus [Brisson et al. (1984) Nature 310:511]. 

Limitations of these gene delivery technologies necessitate the 
development of alternative vector systems suitable for transferring large (up 
to Mb size or larger) genes, gene complexes, and multiple genes together 

with regulatory elements for safe, controlled, and persistent expression of 
the desired genetic material in higher organisms, particularly plants, without 
rearrangement caused by insertion or mutagenesis. Therefore, it is an object 
herein to provide artificial chromosomes for the introduction of large nucleic 
acids into eukaryotic cells and methods using the artificial chromosomes, 

particularly for the introduction and expression of nucleic acids in plants. 
SUMMARY OF THE INVENTION 

Provided herein are plant artificial chromosomes and methods for 
producing plant artificial chromosomes. The artificial chromosomes are fully 
functional stable chromosomes. Plant artificial chromosomes provided herein 
have a particular composition that makes them ideal vectors for stable, 
controlled, high-level expression of heterologous nucleic acids in plant cells. 
The artificial chromosomes are capable of independent, extra-genomic 
maintenance, replication and segregation within cells and can carry multiple, 
large heterologous genes. 

Artificial plant chromosomes provided herein are non-natural 
chromosomes that exhibit an ordered segmentation that distinguishes them 
from naturally occurring chromosomes. The segmented appearance can be 
visualized using a variety of chromosome analysis techniques and correlates 
with the unique structure of these artificial chromosomes, which, in 
particular methods of producing these chromosomes, can arise through 
amplification of chromosomal segments (i.e., amplification-based artificial 
chromosomes). The artificial chromosomes, throughout the region or regions 
of segmentation, are predominantly made up of one or more nucleic acid 
units that is (are) repeated in the region (referred to as the repeat region) and 
that have a similar gross structure. Repeats of a nucleic acid unit tend to be 
of similar size and share some common nucleic acid sequences, for example, 
a replication site involved in amplification of chromosome segments and/or 
some heterologous nucleic acid. Although the size of a repeating nucleic 
acid unit can vary, the units typically are greater than about 100 kb, 
greater than about 500 kb, greater than about 1 Mb, greater than about 5 
Mb or greater than about 10 Mb. Typically, repeats of a nucleic acid unit are 
substantially similar in nucleic acid composition and can be nearly identical. 
The common nucleic acid sequences can contain sequences that represent 
euchromatic and heterochromatic nucleic acid. The composition of the 
amplification-based artificial chromosomes can be such that substantially the 
entire chromosome exhibits a segmented appearance or such that only one 
or more portions that make up less than the entire chromosome appear 
segmented. 

The composition of the plant artificial chromosomes provided herein 
can vary. For example, in some of the artificial chromosomes provided 
herein, the repeat region or regions can be made up predominantly of 
heterochromatic DNA (i.e., the repeat region or regions contain more 
heterochromatic DNA than other types of DNA, e.g., euchromatic DNA). In 
other artificial chromosomes provided herein, the repeat region or regions can 
be made up predominantly of euchromatic DNA (i.e., the repeat region or 
regions contain more euchromatic DNA than other types of DNA, e.g., 
heterochromatic DNA) or can be made up of substantially equivalent 
amounts of heterochromatic and euchromatic DNA, e.g., about 40% to 
about 50% of one type of nucleic acid and about 50% to about 60% of the 
other type of nucleic acid. The repeat region or regions thus can be entirely 
heterochromatic (while still containing one or more heterologous genes), or 
can contain increasing amounts of euchromatic DNA, such that, for example, 
the region contains about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 
90% or greater than 90% euchromatic DNA. Common nucleic acid 
sequences within repeated nucleic acid units in a repeat region can contain 
DNA that represents euchromatic nucleic acid and DNA that represents 
heterochromatic nucleic acid. Because the entire artificial chromosome can 
be made up predominantly of a repeat region or regions (e.g., the 
composition of the chromosome is such that the repeat region or regions 
make up greater than about 50% or greater than about 60% of the 
chromosome), it is thus possible for the artificial chromosome to be made up 
predominantly of heterochromatin or euchromatin, or to be made up of 
substantially equivalent amounts of heterochromatin and euchromatin, e.g., 
about 40% to about 50% of one type of nucleic acid and about 50% to 
about 60% of the other type of nucleic acid. Plant artificial chromosomes 
provided herein can be isolated or contained within cells or vesicles. 
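
By way of illustration only, the compositional categories described above can be expressed as a simple classification by euchromatic fraction. The following Python sketch is not part of the methods provided herein; the function name classify_composition and the exact cut-offs (0.40 and 0.60, one reading of "about 40%" and "about 60%") are assumptions made for the example.

```python
def classify_composition(euchromatic_fraction: float) -> str:
    """Label a repeat region (or whole chromosome) by its euchromatic fraction.

    The 0.40 and 0.60 cut-offs are illustrative assumptions standing in for
    "about 40%" and "about 60%"; the heterochromatic fraction is taken to be
    1 - euchromatic_fraction.
    """
    if not 0.0 <= euchromatic_fraction <= 1.0:
        raise ValueError("fraction must lie between 0 and 1")
    # Roughly 40-60% of each type: substantially equivalent amounts.
    if 0.40 <= euchromatic_fraction <= 0.60:
        return "substantially equivalent"
    # More heterochromatic DNA than euchromatic DNA.
    if euchromatic_fraction < 0.40:
        return "predominantly heterochromatic"
    # More euchromatic DNA than heterochromatic DNA.
    return "predominantly euchromatic"


if __name__ == "__main__":
    for fraction in (0.0, 0.10, 0.45, 0.90):
        print(f"{fraction:.2f} -> {classify_composition(fraction)}")
```

Under these assumed cut-offs, an entirely heterochromatic repeat region (fraction 0.0) and a region containing about 10% euchromatic DNA both fall in the predominantly heterochromatic category.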

Also provided herein are cells containing plant artificial chromosomes 
as described herein, including plant cells and animal cells. Included among 
the cells containing the plant artificial chromosomes are any cells that include 
one or more plant chromosomes. Included, for example, are plant cells, 
including plant protoplasts, in culture and within plant tissues, organs, seeds, 
pollen or whole plants. Plant cells containing the plant artificial 
chromosomes can be from any type of plant, including monocots and dicots. 
For example, the plant cells can be from Arabidopsis, Nicotiana, Solanum, 
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum, Helianthus, 
Oryza, Glycine (soybean) and Gossypium (cotton). Also contemplated are 
mammalian and other animal cells that contain plant ACs. 

Plant cells containing artificial chromosomes of any species are also 
provided herein. Thus, for example, such plant cells can contain an artificial 
chromosome containing an animal, e.g., mammalian, centromere or an insect 
or avian centromere. Included among the artificial chromosomes contained 
within plant cells as provided herein are predominantly heterochromatic 
artificial chromosomes [formerly referred to as satellite artificial 
chromosomes (SATACs); see, e.g., 
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT 
application No. WO 97/40183], minichromosomes which contain a de novo 
centromere, artificial chromosomes containing one or more regions of 
repeating nucleic acid units wherein the repeat region(s) contain substantially 
equivalent amounts of euchromatic and heterochromatic nucleic acid, and in 
vitro assembled artificial chromosomes, each from any species. An 
exemplary artificial chromosome is a mammalian satellite artificial 
chromosome containing a mouse centromere. Included among the plant cells 
containing artificial chromosomes of any species are plant cells, including 
plant protoplasts, in culture and within plant tissues, organs, seeds, pollen or 
whole plants. Plant cells containing the artificial chromosomes can be from 
any type of plant, including monocots and dicots. For example, the plant 
cells can be from Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, 
Hordeum, Zea mays, Brassica, Triticum, Helianthus and Oryza. 

Further provided herein are methods of producing plant artificial 
chromosomes. One embodiment of these methods includes the steps of 
introducing nucleic acid into a cell containing plant chromosomes and 
selecting a cell containing an artificial chromosome that contains one or more 
repeat regions in which one or more nucleic acid units is (are) repeated. The 
repeats of a nucleic acid unit in a repeat region can contain common nucleic 
acid sequences and can be substantially identical. In some embodiments of 
this method, the repeat region(s) of the artificial chromosome contain 
substantially equivalent amounts of euchromatic and heterochromatic nucleic 
acid. The artificial chromosome can be predominantly made up of one or 
more repeat regions. In further embodiments of this method, the artificial 
chromosome is made up of substantially equivalent amounts of euchromatic 
and heterochromatic nucleic acid. In further embodiments of this method, 
the repeats of a nucleic acid unit have common nucleic acid sequences 
which contain sequences that represent euchromatic and heterochromatic 
nucleic acid. 

Any cell containing plant chromosomes can be used in these 
embodiments of methods of producing plant artificial chromosomes described 
herein. For example, the cell can be any cell that contains chromosomes 
from Arabidopsis, tobacco, Solanum, Lycopersicon, Daucus, Hordeum, Zea 
mays, Brassica, Triticum, Oryza, Capsicum, lentil and/or Helianthus, including 
cells or protoplasts of Arabidopsis, tobacco and/or Helianthus. 

The nucleic acid that is introduced into a cell containing plant 
chromosomes in methods of producing a plant artificial chromosome as 
provided herein can be any nucleic acid, including, but not limited to, satellite 
DNA, rDNA and lambda phage DNA. Satellite DNA and rDNA include such 
DNA from plants, such as, for example, Arabidopsis, Nicotiana, Solanum, 
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza, 
and from animals, such as mammals. The rDNA can contain sequences of 
an intergenic spacer region, such as can be obtained, for example, from DNA 
of Arabidopsis, Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, 
radish and mung bean. In some embodiments of the method, the nucleic 

acid contains a nucleic acid sequence that facilitates amplification of a region 
of a plant chromosome or targets it to an amplifiable region of a plant 
chromosome. 

In further embodiments of methods of producing plant artificial 
chromosomes provided herein, the nucleic acid that is introduced into a cell 
containing one or more plant chromosomes includes nucleic acid that provides 
for identification of cells containing the nucleic acid. Such nucleic acids include 
nucleic acid encoding a fluorescent protein, such as a green, blue or red 
fluorescent protein, and nucleic acid encoding a selectable marker, such as, 
for example, proteins that confer resistance to phosphinothricin, ammonium 
glufosinate, glyphosate, kanamycin, hygromycin, dihydrofolate or 
sulfonylurea. 

In embodiments of methods of producing plant artificial chromosomes 
in which nucleic acid is introduced into a cell containing one or more plant 
chromosomes, the cell can be cultured through two or more cell doublings, 
and typically from about 5 to about 60, or about 5 to about 55, or about 10 
to about 55, or about 25 to about 55, or about 35 to about 55 cell doublings 
following introduction of nucleic acid into a cell. The step of selecting a cell 
containing a plant artificial chromosome can include sorting of cells into 
which nucleic acid was introduced. For example, cells can be sorted on the 
basis of the presence of a selectable marker, such as a reporter protein, or 
by growing (culturing) the cells under selective conditions. The selection 
step can include fluorescent in situ hybridization (FISH) analysis of cells into 
which nucleic acid is introduced. 

Also provided are methods of producing a transgenic plant using 
artificial chromosomes that function in plants and transgenic plants 
containing artificial chromosomes. Artificial chromosomes used in the 
methods of producing transgenic plants can be of any species. For example, 
the artificial chromosomes can contain a centromere from species such as 
animals (e.g., mammals, birds or insects) or plants, that functions to segregate 
nucleic acids to daughter cells through cell division. In some embodiments 
of the methods for producing a transgenic plant, the artificial chromosomes 
contain repeat regions predominantly made up of repeats of one or more 
nucleic acid units. Repeats of a nucleic acid unit can share some common 
nucleic acid sequences, for example, a replication site involved in 
amplification of chromosome segments and/or some heterologous nucleic 
acid. Repeats of a nucleic acid unit can be substantially identical. Common 
nucleic acid sequences of repeats of a nucleic acid unit can contain 
sequences that represent euchromatic and heterochromatic nucleic acid. 

Repeat regions of artificial chromosomes that can be used in the 
methods of producing a transgenic plant can be made up of substantially 
equivalent amounts of heterochromatic and euchromatic DNA or can be 
made up predominantly of heterochromatic DNA or can be made up 
predominantly of euchromatic DNA. The artificial chromosome can be made 
up predominantly of heterochromatic or euchromatic DNA or can be made up 
of substantially equivalent amounts of heterochromatin and euchromatin. 
Such artificial chromosomes that contain plant centromeres can contain a 
plant centromere from any species of plant, including monocots and dicots. 
For example, the centromere can be from Arabidopsis, tobacco, Helianthus, 
Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, Triticum, rye, 
wheat, radish, mung bean or Oryza. The artificial chromosomes can be 
made using methods described herein. 

In a method of producing a transgenic plant provided herein, an 
artificial chromosome, such as those described above and elsewhere herein, 
is introduced into a plant cell. The artificial chromosome can contain 
heterologous nucleic acid encoding a gene product such as, for example, an 
enzyme, antisense RNA, tRNA, rDNA, a structural protein, a marker or 
reporter protein, a ligand, a receptor, a ribozyme, a therapeutic protein, a 
biopharmaceutical protein, a vaccine, a blood factor, an antigen, a hormone, 
a cytokine, a growth factor or an antibody. The product can be one that 
provides for resistance to diseases, insects, herbicides or stress in the plant. 
The product can be one that provides for an agronomically important trait in 
the plant and/or that alters the nutrient utilization and/or improves the 
nutrient quality of the plant. Heterologous nucleic acid of an artificial 
chromosome can be contained within a bacterial artificial chromosome (BAC) 
or a yeast artificial chromosome (YAC). 

The plant cell into which such artificial chromosomes can be 
introduced in methods of producing a transgenic plant provided herein can be 
any species of plant cell, including, but not limited to, Arabidopsis, tobacco, 
Helianthus, Solanum, Lycopersicon, Daucus, Hordeum, Zea, Brassica, 
Triticum, rye, wheat, radish, mung bean, Capsicum, lentil and Oryza. Any 
cell that can develop into a plant can be used, including plant cells and 
protoplasts of plant embryos, calli, tissues, meristem, organs, seeds, 
seedlings, pollen, pollen tubes or whole plants. 

Artificial chromosomes can be introduced into plant cells in the 
methods of producing a transgenic plant using any process for transfer of 
nucleic acids into plant cells, including, but not limited to, chemical, physical 
and electrical processes and combinations thereof. For example, the artificial 
chromosomes can be transferred into plant cells via direct contact in the 
absence or presence of a fusogen, e.g., polyethylene glycol (PEG), calcium 
phosphate and/or lipid, or they can be encapsulated in a lipid structure (e.g., a 
liposome) or contained within a protoplast or microcell which is then allowed 
to fuse (in the presence or absence of a fusogen such as PEG) with a plant 
cell for introduction of the artificial chromosome into the cell in a method of 
producing a transgenic plant. Artificial chromosomes can be transferred to 
plant cells that are subjected to electrical pulses (e.g., electroporation) and/or 
ultrasound (e.g., sonoporation) before, during and/or after exposure of the 
cells to the artificial chromosomes. Use of electrical pulses and/or ultrasound 
can be in combination with any other agents, e.g., PEG and/or lipids, used in 
transferring nucleic acids into plant cells. Artificial chromosomes can also be 
physically injected into plant cells through a micropipette or needle or 
introduced into plant cells through bombardment of the cells with 
microprojectiles coated with the chromosomes. To facilitate transfer of 




nucleic acids into plant cells, the recipient cells or tissue can be subjected to 
mechanical wounding. 

Plant cells into which artificial chromosomes have been introduced for 
purposes of producing a transgenic plant are cultured under conditions that 
permit generation of a whole plant therefrom. The transformed cells can be 
analyzed prior to use in the generation of whole plants to determine 
suitability. For example, the cells can be analyzed for the presence of 
artificial chromosomes and/or regenerative capacity. Plant regeneration 
techniques, many of which are known to those of skill in the art, can be 
used to generate whole plants from, for example, cells, embryos and calli 
containing artificial chromosomes. For example, plants can be regenerated 
from cells containing artificial chromosomes by the planting of transformed 
roots, plantlets, seed, seedlings, and any structure capable of growing into a 
whole plant. 

Further provided herein are methods for producing an acrocentric plant 
chromosome and methods for producing plant chromosomes containing 
adjacent regions of rDNA and heterochromatin, in particular, pericentric 
and/or satellite heterochromatin. Also provided herein are methods for 
generating acrocentric plant chromosomes containing adjacent regions of 

heterochromatin, such as pericentric heterochromatin and/or satellite DNA, 
and rDNA on the short arm of the chromosome. 

One embodiment of these methods includes steps of introducing 
nucleic acid containing two site-specific recombination sites into a cell 
containing one or more plant chromosomes, recombining nucleic acids of the 

25 two site-specific recombination sites, and selecting a cell containing an 
acrocentric plant chromosome and/or a plant chromosome containing 
adjacent regions of rDNA and heterochromatin. The two site-specific 
recombination sites can be contained on separate nucleic acid fragments 
which are introduced into the cell simultaneously or sequentially. 

Other embodiments of the methods of producing an acrocentric plant 
chromosome and/or a plant chromosome that contains adjacent regions of 
rDNA and heterochromatin include steps of introducing a first nucleic acid 
containing a site-specific recombination site into a first plant chromosome, 
introducing a second nucleic acid containing a site-specific recombination 
site into a second plant chromosome, recombining nucleic acids of the first 
and second chromosomes and selecting a plant chromosome that is 
acrocentric or that contains adjacent regions of rDNA and heterochromatin. 
For example, to produce an acrocentric plant chromosome, the first nucleic 
acid can be introduced into or adjacent to the pericentric heterochromatin of 
the first chromosome and/or the second nucleic acid can be introduced into 
the distal end of the arm of the second chromosome. To produce an 
acrocentric plant chromosome containing adjacent regions of rDNA and 
heterochromatin, for example, the first nucleic acid can be introduced into or 
adjacent to the pericentric heterochromatin on the short arm of an acrocentric 
plant chromosome and the second nucleic acid can be introduced into or 
adjacent to rDNA. To produce a plant chromosome containing adjacent 
regions of rDNA and heterochromatin, for example, the first nucleic acid can 
be introduced into or adjacent to heterochromatin, such as pericentric 
heterochromatin or satellite DNA, and the second nucleic acid can be 
introduced into or adjacent to rDNA. When the chromosomes are located 
within a cell, the method can include selecting a cell containing a plant 
chromosome that is acrocentric and/or that contains adjacent regions of 
rDNA and heterochromatin. 

Another embodiment of the methods of producing an acrocentric plant 
chromosome includes steps of introducing a first nucleic acid containing a 
site-specific recombination site into the pericentric heterochromatin of a plant 
chromosome, introducing a second nucleic acid containing a site-specific 
recombination site into the distal end of the chromosome in which the first 
and second recombination sites are located on the same arm of the 
chromosome, recombining nucleic acids of the first and second 
recombination sites in the chromosome and selecting a plant chromosome 
that is acrocentric. 
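
As a purely illustrative sketch of the embodiment above, the following Python example models a chromosome as an ordered list of named segments and removes the material lying between two recombination sites on the same arm. It assumes, as a simplification not stated in the embodiment, that the two sites are in the same (direct) orientation, in which case site-specific recombinases generally excise the intervening DNA and leave a single site behind. The segment names and the function excise_between_sites are hypothetical.

```python
from typing import List, Tuple


def excise_between_sites(
    chromosome: List[str], site_label: str = "site"
) -> Tuple[List[str], List[str]]:
    """Recombine two directly oriented sites on one chromosome.

    Returns (product, excised): the chromosome retaining a single site,
    and the intervening segments that are lost. Illustrative model only.
    """
    positions = [i for i, seg in enumerate(chromosome) if seg == site_label]
    if len(positions) != 2:
        raise ValueError("exactly two recombination sites are required")
    first, second = positions
    # Segments strictly between the two sites are removed.
    excised = chromosome[first + 1:second]
    # Keep everything up to and including the first site, then everything
    # after the second site, so that one site remains in the product.
    product = chromosome[:first + 1] + chromosome[second + 1:]
    return product, excised


if __name__ == "__main__":
    # One site at the distal end of an arm, one in the pericentric
    # heterochromatin of the same arm (hypothetical segment names).
    chromosome = [
        "telomere", "site", "distal_arm", "proximal_arm", "site",
        "pericentric_heterochromatin", "centromere", "long_arm", "telomere",
    ]
    product, excised = excise_between_sites(chromosome)
    print("product:", product)
    print("excised:", excised)
```

In this model the product retains the telomere, one recombination site and the pericentric heterochromatin on a much shortened arm, which corresponds to the acrocentric chromosome that is then selected.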

Another method of producing an acrocentric plant chromosome or a 
plant chromosome containing adjacent regions of rDNA and heterochromatin 
includes steps of introducing nucleic acid containing a recombination site 
adjacent to or sufficiently near nucleic acid encoding a selectable marker into 
a first plant cell for recombination and introduction of the marker into the 
chromosome, generating a first transgenic plant from the first plant cell, 

introducing nucleic acid containing a promoter functional in a plant cell and a 
recombination site in operative linkage into a second plant cell, generating a 
second transgenic plant from the second plant cell, crossing the first and 
second plants, obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker, and selecting a 

resistant plant that contains cells containing an acrocentric plant 
chromosome or a plant chromosome containing adjacent regions of rDNA 
and heterochromatin. Methods of this embodiment can optionally include 
steps of selecting first and second transgenic plants such that one of the 
plants contains a chromosome containing a recombination site in a region 

within or adjacent to the pericentric heterochromatin and the other plant 
contains a chromosome containing a recombination site located within or 
adjacent to rDNA of the chromosome. These methods can further include 
the steps of selecting first and second transgenic plants where one of the 
plants contains a chromosome containing a recombination site located on a 

short arm of the chromosome in a region adjacent to the pericentric 
heterochromatin, and the other plant contains a chromosome containing a 
recombination site located in rDNA of the chromosome. In one embodiment, 
the recombination sites on the two chromosomes are in the same orientation. 

In methods of producing an acrocentric plant chromosome, one or 
both of these recombination sites is located on a short arm of the 
chromosome. For example, one of the plants contains a 
chromosome containing a recombination site in a region within or adjacent to 
the pericentric heterochromatin located on the short arm of the chromosome. 
The selecting steps can further include selecting first and second transgenic 
plants such that the recombination sites on the two chromosomes are in the 
same orientation. 

In any of these methods of producing an acrocentric plant 
chromosome or a plant chromosome containing adjacent regions of rDNA 
and heterochromatin (in particular, pericentric heterochromatin and/or 
satellite DNA), recombination between the first and second site-specific 
recombination sites can be provided for in a number of ways. For example, a 
recombinase activity that catalyzes the recombination reaction can be 
introduced into a cell containing one or more chromosomes containing the 
sites. The recombinase activity can be encoded by nucleic acid that is 
introduced into the cell simultaneously with nucleic acid containing a 
site-specific recombination site or that is introduced into the cell at a different 
time. Recombinase activity occurs within the cell upon expression of the 

nucleic acid encoding a recombinase activity, which can be operatively linked 
to a promoter functional in the cell. The recombinase activity can be 
constitutively expressed or can be induced, for example, by linking the 
nucleic acid encoding the recombinase to an inducible promoter. It is also 
possible that a cell into which nucleic acid containing site-specific 

recombination sites is introduced contains a recombinase enzyme which can 
be constitutively or inducibly expressed. Alternatively, a transgenic plant can 
be generated from cells containing the recombination sites and crossed with 
a transgenic plant containing nucleic acid encoding a recombinase. 

Any site-specific recombinase system known to those of skill in the 

art is contemplated for use herein. It is contemplated that one or a plurality 
of sites that direct the recombination by the recombinase are introduced into 
the ACes (or other ACs) and then heterologous genes linked to the cognate 
site are introduced into an ACes to produce platform ACes. The resulting 
ACes are introduced into cells with nucleic acid encoding the cognate 
recombinase, typically on a vector, and nucleic acid encoding heterologous 
nucleic acid of interest linked to the appropriate recombination site for 
insertion into the ACes chromosome. The recombinase-encoding nucleic 
acid can be introduced into the AC, including ACes, or on the same or a 
different vector from the heterologous nucleic acid. 

For the methods herein, any recombinase enzyme that catalyzes 
site-specific recombination can be used to facilitate recombination between the 
first and second site-specific recombination sites. A variety of recombinases 
and attachment/recombination sites therefor are available and/or known to 
those of skill in the art. These include, but are not limited to: the Cre/lox 
recombination system using CRE recombinase from the Escherichia coli 
phage P1; the FLP/FRT system of yeast using the FLP recombinase from the 
2μ episome of Saccharomyces cerevisiae; the resolvases, including Gin 
recombinase of phage Mu, Cin, Hin, oS Tn3; the Pin recombinase of E. coli; 
the R/RS system of the pSR1 plasmid of Zygosaccharomyces rouxii; 
site-specific recombinases from Kluyveromyces drosophilarium and 
Kluyveromyces waltii and other systems are 

Also contemplated is the E. coli phage lambda integrase system, the phage 
lambda integrase and the cognate att sites (see also copending 
U.S. application Serial No. (attorney docket No. 24601-420, filed on the 
same day herewith)). 

In any of these methods of producing acrocentric plant chromosomes, 
nucleic acid containing a site-specific recombination site can also contain 
nucleic acid encoding a selectable marker. The nucleic acids used in the 
methods can be designed such that expression of the selectable marker 

occurs only upon the desired recombination event. 

Acrocentric plant chromosomes produced by the methods provided 
herein can be of any composition. For example, the DNA of the short arm of 
the acrocentric chromosome can contain less than 5% or less than 1 % 
euchromatic DNA or can contain no euchromatic DNA. Acrocentric plant 
5 artificial chromosomes in which the short arm of the acrocentric chromosome 
does not contain euchromatic DNA are provided. 

In another embodiment, a method of producing a plant artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant cell acrocentric chromosome in which the short
arm does not contain euchromatic DNA; culturing the cell through at least
one cell division; and selecting a cell containing an artificial
chromosome, such as one that is predominantly heterochromatic. The
acrocentric chromosome is produced by any of the methods described herein
or by other suitable methods.

In another embodiment, a method for producing an artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant cell; and selecting a plant cell that includes an
artificial chromosome that contains one or more repeat regions. In this
AC, one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid sequences; and the
common sequences of nucleotides include sequences that represent
euchromatic and heterochromatic nucleic acid. The nucleic acid can include
plant rDNA from a dicot plant species or plant rDNA from a monocot plant
species. The intergenic spacer region of the rDNA can be from a Nicotiana
plant or other suitable source of such DNA. The rDNA can be plant rDNA,
and the plant can be a dicot or a monocot.

Also provided are isolated plant artificial chromosomes that contain
one or more repeat regions. In these ACs, one or more nucleic acid units is
(are) repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the common sequences of nucleotides include
sequences that represent euchromatic and heterochromatic nucleic acid. The
artificial chromosome can be produced by a method that includes the steps
of: introducing nucleic acid into a plant cell; and selecting a plant cell
containing an artificial chromosome that contains one or more repeat regions.
The repeats of a nucleic acid unit have common nucleic acid sequences; and
the common nucleic acid sequences contain sequences that represent
euchromatic and heterochromatic nucleic acid.

In another embodiment, another method for producing an acrocentric
plant chromosome is provided. The method includes the steps of:
introducing nucleic acid containing two site-specific recombination sites into
a cell containing one or more plant chromosomes; and introducing into the
cell a recombinase activity that catalyzes recombination between the two
recombination sites to produce a plant acrocentric chromosome. In this
embodiment, the two site-specific recombination sites can be on separate
nucleic acid fragments, which optionally can be introduced into the cell
simultaneously or sequentially. The resulting artificial chromosome can be
one that is predominantly heterochromatic.

In another embodiment, a method of producing a plant artificial
chromosome is provided. The method includes the steps of: introducing
nucleic acid into a plant chromosome, such as, but not limited to, an
acrocentric chromosome, in a cell that contains adjacent regions of rDNA and
heterochromatic DNA; culturing the cell through at least one cell division;
and selecting a cell containing an artificial chromosome. The resulting
artificial chromosome can be predominantly heterochromatic. The
acrocentric chromosome can be one where the short arm of the chromosome
contains adjacent regions of rDNA and heterochromatic DNA, such as, but
not limited to, pericentric heterochromatin.

Also provided are a variety of vectors. Among these are vectors
containing nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells;
and wherein the agent is not toxic to plant cells; a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome. Exemplary of such vectors are pAgIIa and pAgIIb.

Another vector provided herein contains nucleic acid encoding a
selectable marker that is not operably associated with any promoter, wherein
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells; and wherein the agent is not toxic to
plant cells; a recognition site for recombination; and nucleic acid encoding a
protein operably linked to a plant promoter. Exemplary of these vectors are
pAg1 and pAg2.

Another vector that is provided contains: nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of plant cells in the presence of an
agent normally toxic to the plant cells but not toxic to animal cells; a
recognition site for recombination; and nucleic acid encoding a protein
operably linked to a plant promoter.

Another vector is a plant transformation vector that contains nucleic
acid encoding a recognition site for recombination; a sequence of nucleotides
that facilitates or causes amplification of a region of a plant chromosome;
one or more selectable markers that are expressed in plant cells to permit the
selection of cells containing the vector; and Agrobacterium nucleic acid. The
vector is for Agrobacterium-mediated transformation of plants.

Another vector that is provided contains a recognition site for
recombination; and a sequence of nucleotides that facilitates amplification of
a region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome, wherein the plant is selected from the group
consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus,
Hordeum, Zea mays, Brassica, Triticum, Helianthus, soybean, cotton and
Oryza.

In these vectors, the amplifiable region can contain heterochromatic
nucleic acid; the amplifiable region can contain rDNA. Exemplary sequences
of nucleotides that facilitate amplification of a region of a plant chromosome
or target the vector to an amplifiable region of a plant chromosome are any
that contain a sufficient portion of an intergenic spacer region of rDNA to
facilitate amplification or effect the targeting. Such sufficient portion can be
at least 14, 20, 30, 50, 100, 150, 300, 500, 1 kb, 2 kb, 3 kb, 5 kb, 10 kb
or more contiguous nucleotides from an intergenic spacer region and/or other
rDNA region. An exemplary selectable marker encodes a product that confers
resistance to zeomycin. The proteins in the vectors include a protein that is a
selectable marker that permits growth of plant cells in the presence of an
agent normally toxic to the plant cells, such as, for example, a protein that
confers resistance to hygromycin or to phosphinothricin. Other such protein
markers include, but are not limited to, fluorescent proteins, such as, for
example, green, blue and red fluorescent proteins. An exemplary recognition
site contains an att site. Exemplary promoters for inclusion in the vectors
include, but are not limited to, nopaline synthase (NOS) or CaMV35S.
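
By way of illustration only, the "sufficient portion" criterion above (a minimum number of contiguous nucleotides shared with an intergenic spacer region) can be checked computationally. The following Python sketch is not part of the described methods; the function names and the toy sequences are hypothetical, only the forward strand is compared, and the default threshold of 14 is simply the smallest value listed above.

```python
def longest_shared_run(vector_seq: str, spacer_seq: str) -> int:
    """Return the length of the longest contiguous stretch of nucleotides
    that occurs in both sequences (forward strand only; an assumption)."""
    a, b = vector_seq.upper(), spacer_seq.upper()
    best = 0
    prev = [0] * (len(b) + 1)  # run lengths ending at the previous position of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1  # extend the run ending at (i-1, j-1)
                best = max(best, curr[j])
        prev = curr
    return best


def has_sufficient_portion(vector_seq: str, spacer_seq: str, min_len: int = 14) -> bool:
    """True if the vector shares at least min_len contiguous nucleotides
    with the spacer sequence."""
    return longest_shared_run(vector_seq, spacer_seq) >= min_len


# Hypothetical toy sequences, not taken from any real rDNA.
spacer = "ACGTTGCAAGCTTGGATCCA"
vector = "TTTT" + spacer[2:18] + "GGGG"
print(has_sufficient_portion(vector, spacer))
```

A practical check would also compare the reverse complement and would use actual intergenic spacer sequence; both are omitted here for brevity.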

Cells containing any of the vectors or mixtures thereof are provided.
The cells include any cells that have at least one plant chromosome, such as
a plant cell. The cells can be protoplasts.

Methods using these vectors are provided. The methods include a
step of introducing one of the vectors into a cell, such as a cell that
contains at least one plant chromosome. Such a vector is, for example, a
vector that contains nucleic acid encoding a selectable marker that is not
operably associated with any promoter, where the selectable marker permits
growth of animal cells in the presence of an agent normally toxic to the
animal cells but is not toxic to plant cells; a recognition site for
recombination; and nucleic acid encoding a protein operably linked to a
plant promoter. In this method, the cell contains an animal, such as a
mammalian, platform ACes that contains a recognition site, such as, for
example, an att site, that recombines with the recognition site in the
vector in the presence of the recombinase therefor, thereby incorporating
the selectable marker that is not operably associated with any promoter and
the nucleic acid encoding a protein operably linked to a plant promoter into
the platform ACes to produce a resulting platform ACes. The platform ACes
can contain a promoter that, upon recombination, is operably linked to the
selectable marker that in the vector is not operably associated with a
promoter. The method can further include transferring the resulting
platform ACes into a plant cell to produce a plant cell that contains the
platform ACes. The method optionally further includes culturing the plant
cell that contains the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.
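
The selection logic of this method, in which a promoterless marker becomes expressed only after site-specific recombination places it under a promoter resident on the platform, can be modeled schematically. The Python sketch below is a simplified illustration under stated assumptions: the class and function names, the "attP"/"attB" site labels, and the promoter and gene names are hypothetical placeholders, and the model ignores the conversion of the sites upon recombination.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Cassette:
    gene: str
    promoter: Optional[str] = None  # None models a promoterless marker

    @property
    def expressed(self) -> bool:
        return self.promoter is not None


@dataclass(frozen=True)
class DonorVector:
    site: str          # recognition site on the vector (hypothetical label)
    marker: Cassette   # selectable marker lacking a promoter
    payload: Cassette  # protein-encoding nucleic acid with its own promoter


@dataclass(frozen=True)
class Platform:
    site: str                        # recognition site on the platform ACes
    resident_promoter: str           # promoter adjacent to the platform site
    integrated: Tuple[Cassette, ...] = ()


# Assumed cognate site pairs; illustrative only.
COGNATE = {("attP", "attB"), ("attB", "attP")}


def recombine(platform: Platform, donor: DonorVector, recombinase_present: bool) -> Platform:
    """Return the platform after attempted integration of the donor vector."""
    if not recombinase_present or (platform.site, donor.site) not in COGNATE:
        return platform  # no recombination, so the marker stays silent
    trapped = Cassette(donor.marker.gene, promoter=platform.resident_promoter)
    return Platform(platform.site, platform.resident_promoter,
                    platform.integrated + (trapped, donor.payload))


platform = Platform(site="attP", resident_promoter="resident-promoter")
donor = DonorVector(site="attB",
                    marker=Cassette("marker-2"),
                    payload=Cassette("protein-of-interest", promoter="plant-promoter"))
result = recombine(platform, donor, recombinase_present=True)
print([c.gene for c in result.integrated if c.expressed])
```

In this model, cells survive selection only when `result.integrated` contains an expressed marker, which mirrors the design in which expression of the marker reports the desired recombination event.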

The resulting platform ACes optionally is isolated prior to transfer.
The ACes can be introduced into a plant cell by any suitable method, such as
one selected from among protoplast transfection, lipid-mediated delivery,
liposomes, electroporation, sonoporation, microinjection, particle
bombardment, silicon carbide whisker-mediated transformation, polyethylene
glycol (PEG)-mediated DNA uptake, lipofection and lipid-mediated carrier
systems. The resulting platform ACes can be transferred by fusion of the
cells, which, for example, are plant protoplasts. In another embodiment, the
cell can be an animal cell, such as a mammalian, including human, cell.

In another method, a vector is introduced into plant cells. Such a
vector, for example, can be a vector that includes nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome. The plant cells are
cultured and a plant cell(s) containing an artificial chromosome that contains
one or more repeat regions is selected. In this method, a sufficient portion of
the vector can integrate into a chromosome in the plant cell to result in
amplification of chromosomal DNA. The resulting selected artificial
chromosome can be one in which one or more nucleic acid units is (are)
repeated in a repeat region; repeats of a nucleic acid unit have common
nucleic acid sequences; and the repeat region(s) contain substantially
equivalent amounts of euchromatic and heterochromatic nucleic acid. The
resulting artificial chromosome produced in the method optionally can be
isolated.

Another method is also provided. This method includes the steps of
introducing a vector into a cell, and culturing the resulting cell under
conditions whereby the protein encoded by nucleic acid operably linked to
an animal promoter is expressed. In the method, the vector can contain:
nucleic acid encoding a selectable marker that is not operably associated
with any promoter, where the selectable marker permits growth of animal
cells in the presence of an agent normally toxic to the animal cells but is not
toxic to plant cells; a recognition site for recombination; and nucleic acid
encoding a protein operably linked to an animal promoter. The cell can
contain a platform plant artificial chromosome (PAC) that contains a
recombination site and an animal promoter that upon recombination is
operably linked to the selectable marker that in the vector is not operably
associated with a promoter. Introduction can be effected under conditions
whereby the vector recombines with the PAC to produce a plant platform
PAC that contains the selectable marker operably linked to the promoter. In
this method, the artificial chromosome can be an ACes. In addition, the
plant platform PAC can be an ACes.

The vectors, such as those that contain nucleic acid encoding a
selectable marker that is not operably associated with any promoter, where
the selectable marker permits growth of animal cells in the presence of an
agent normally toxic to the animal cells but is not toxic to plant cells; a
recognition site for recombination; and a sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome, and the plant
transformation vectors that contain nucleic acid for Agrobacterium-mediated
transformation of plants, can be used to produce artificial chromosomes. In
one exemplary method, such a vector is introduced into a cell containing one
or more plant chromosomes; and a cell containing an artificial chromosome
that contains one or more repeat regions is selected. The artificial
chromosome contains one or more nucleic acid units that is (are) repeated in
a repeat region; the repeats of a nucleic acid unit have common nucleic acid
sequences; and the common nucleic acid sequences contain sequences that
represent euchromatic and heterochromatic nucleic acid. In another method,
a cell containing an artificial chromosome that contains one or more repeat
regions is selected. The artificial chromosome contains one or more nucleic
acid units that is (are) repeated in a repeat region; repeats of a nucleic acid
unit have common nucleic acid sequences; and the repeat region(s) contain
substantially equivalent amounts of euchromatic and heterochromatic nucleic
acid.
DESCRIPTION OF THE DRAWINGS 
Figure 1 provides a map of plasmid pAg1.

Figure 2 provides a schematic representation of the construction of
plasmid pAg1.

Figure 3 provides a map of plasmid pAg2.

Figure 4 provides a schematic representation of the construction of
plasmid pAg2.

Figure 5 provides a schematic representation of the construction of
plasmids pAgIIa and pAgIIb.

Figures 6A-6B provide restriction maps of the DNA inserted into pAg1
to form plasmids pAgIIa and pAgIIb.

Figure 7 provides a map of plasmid pSV40193attPsensePUR.

Figure 8 depicts a method for formation of a chromosome platform 
with multiple recombination integration sites, such as attP sites. 

Figure 9 diagrammatically summarizes the platform technology;
marker 1 permits selection of the artificial chromosomes containing the
integration site; marker 2, which is promoterless in the donor vector, permits
selection of recombinants. Upon recombination with the platform, marker 2
is expressed under the control of a promoter resident on the platform.
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

Definitions 

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art
to which this invention belongs. All patents, patent applications, published
applications and other publications and published nucleotide and amino acid
sequences (e.g., sequences available in GenBank or other databases) referred
to herein are incorporated by reference in their entirety. Where reference is
made to a URL or other such identifier or address, it is understood that such
identifiers can change and particular information on the internet can come
and go, but equivalent information can be found by searching the internet.
Reference thereto evidences the availability and public dissemination of such
information.

As used herein, a chromosome is a defined composition of nucleic
acid that is capable of replication and segregation within a cell upon cell
division. Typically, a chromosome may contain a centromeric region,
telomeric regions and a region of nucleic acid between the centromeric and
telomeric regions.

As used herein, a centromere is a molecular composition that includes
a nucleic acid sequence that confers an ability to segregate to daughter cells
through cell division. A centromere may confer stable segregation of a
nucleic acid sequence, including an artificial chromosome containing the
centromere, through mitotic and/or meiotic divisions. A plant centromere is
not necessarily derived from plants, but has the ability to promote DNA
segregation in plant cells.

As used herein, euchromatin and heterochromatin have their
recognized meanings. Euchromatin refers to chromatin that stains diffusely
and that typically contains genes, and heterochromatin refers to chromatin
that remains unusually condensed and that has been thought to be
transcriptionally inactive or has low transcriptional activity relative to
euchromatin. Highly repetitive DNA sequences (satellite DNA) are usually
located in regions of the heterochromatin surrounding the centromere
(pericentric or pericentromeric heterochromatin). Constitutive
heterochromatin refers to heterochromatin that contains the highly repetitive
DNA which is constitutively condensed and genetically inactive.

As used herein, an acrocentric chromosome refers to a chromosome 
with arms of unequal length. 

As used herein, endogenous chromosomes refer to genomic
chromosomes as found in the cell prior to generation or introduction of an
artificial chromosome.

As used herein, artificial chromosomes are nucleic acid molecules,
typically DNA, that stably replicate and segregate alongside endogenous
chromosomes in cells and have the capacity to accommodate and express
heterologous genes contained therein. A mammalian artificial chromosome
(MAC) refers to a chromosome that has an active mammalian centromere(s).
Plant artificial chromosomes (PAC), insect artificial chromosomes and avian
artificial chromosomes refer to chromosomes that include centromeres that
function in plant, insect and avian cells, respectively. Human artificial
chromosomes (HAC) refer to chromosomes that include centromeres that
function in human cells. For exemplary artificial chromosomes, see, e.g.,
U.S. Patent Nos. 6,025,155; 6,077,697; 5,288,625; 5,712,134;
5,695,967; 5,869,294; 5,891,691 and 5,721,118 and published
International PCT application Nos. WO 97/40183 and WO 98/08964.

As used herein, amplification, with reference to DNA, is a process in
which segments of DNA are duplicated to yield two or more copies of
identical, nearly identical or substantially similar DNA segments that are
typically joined as substantially tandem or successive repeats or inverted
repeats.

As used herein, amplification-based artificial chromosomes are
artificial chromosomes derived from natural or endogenous chromosomes by
virtue of an amplification event, such as one that may be initiated by
introduction of heterologous nucleic acid into heterochromatin, for example,
pericentric heterochromatin, in a chromosome. As a result of such an event,
chromosomes and/or fragments thereof exhibiting segmented or repeating
patterns arise. Artificial chromosomes can be formed from these
chromosomes and fragments. Hence, amplification-based artificial
chromosomes refer to non-natural or isolated chromosomes that exhibit an
ordered segmentation that is not typically observed in naturally occurring
chromosomes and that can be a basis for distinguishing them from naturally
occurring chromosomes. Amplification-based artificial chromosomes can
also be distinguished from naturally occurring chromosomes by virtue of their
typically smaller size and often segmented appearance when visualized. The
segmented appearance, which can be visualized using a variety of
chromosome analysis techniques as described herein and known to those of
skill in the art, correlates with the unique structure of these artificial
chromosomes. In addition to containing one or more centromeres, the
amplification-based artificial chromosomes, throughout the region or regions
of segmentation, are predominantly made up of one or more nucleic acid
units, also referred to as "amplicons", that is (are) repeated in the region and
that have a similar gross structure. Thus, a region of segmentation may be
referred to as a repeat region. Repeats of an amplicon tend to be of similar
size and share some common nucleic acid sequences. For example, each
repeat of an amplicon may contain a replication site involved in amplification
of chromosome segments and/or some heterologous nucleic acid that was
utilized in the initial production of the artificial chromosome. Typically, the
repeating units are substantially similar in nucleic acid composition and may
be nearly identical. The common nucleic acid sequences may contain
sequences that represent euchromatic and heterochromatic nucleic acid.
Amplicon sizes vary but typically tend to be greater than about 100 kb,
greater than about 500 kb, greater than about 1 Mb, greater than about 5
Mb or greater than about 10 Mb. The composition of the amplification-based
artificial chromosomes may be such that substantially the entire chromosome
exhibits a segmented appearance or such that only one or more portions that
make up less than the entire chromosome appear segmented. The
amplification-based artificial chromosomes can also differ depending on the
chromosomal region that has undergone amplification in the process of
artificial chromosome formation. The structures of the resulting
chromosomes can vary depending upon the initiating event and/or the
conditions under which the heterologous nucleic acid is introduced, including
modification to the endogenous chromosomes. For example, in some of the
artificial chromosomes provided herein, the region or regions of segmentation
may be made up predominantly of heterochromatic DNA. In other artificial
chromosomes provided herein, the region or regions of segmentation may be
made up predominantly of euchromatic DNA or may be made up of similar
amounts of heterochromatic and euchromatic DNA. The region or regions of
segmentation thus may be entirely heterochromatic (while still containing one
or more heterologous nucleic acid sequences), or may contain increasing
amounts of euchromatic DNA, such that, for example, the region contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA. Because the entire artificial chromosome can be
made up predominantly of a region or regions of segmentation, it is thus
possible for the artificial chromosome to be made up predominantly of
heterochromatin or euchromatin, or to be made up of substantially equivalent
amounts of heterochromatin and euchromatin, e.g., about 40% to about
50% of one type of nucleic acid and about 50% to about 60% of the other
type of nucleic acid.

As used herein, the term "predominantly" with respect to a
composition generally refers to a state of the composition in which it can be
characterized as being or having more of the predominant feature than other
features which are not predominant. The predominant feature may represent
more than about 50%, more than about 60%, more than about 70%, more
than about 80%, more than about 90%, more than about 95% or essentially
100% of the composition. Thus, for example, a repeat region that is
predominantly made up of heterochromatic DNA contains more
heterochromatic DNA than other types, e.g., euchromatic, of DNA. The
repeat region may be more than about 50%, more than about 60%, more
than about 70%, more than about 80%, more than about 90% or more than
about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA. An artificial chromosome predominantly made up of
heterochromatin contains more heterochromatic DNA than other types, e.g.,
euchromatic, of DNA and may be more than about 50%, more than about
60%, more than about 70%, more than about 80%, more than about 90%
or more than about 95% heterochromatic DNA or may be essentially 100%
heterochromatic DNA.
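
For illustration, the percentage thresholds used in these definitions can be expressed as simple predicates. The Python sketch below is not part of the definitions themselves; the function names are hypothetical, the default threshold of 0.5 corresponds to "more than about 50%", and the 40% to 60% window corresponds to the "substantially equivalent amounts" example given above. Because the text qualifies each figure with "about", the exact cut-offs here are an assumption.

```python
def is_predominantly(fraction: float, threshold: float = 0.5) -> bool:
    """True if a feature making up `fraction` of a composition exceeds
    `threshold` (0.5 by default; stricter values such as 0.9 may be passed)."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return fraction > threshold


def is_substantially_equivalent(fraction: float) -> bool:
    """True if one DNA type makes up about 40% to 60% of a two-component
    composition, so the other type makes up the remaining 60% to 40%."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 0.4 <= fraction <= 0.6


# Hypothetical repeat region measured as 85% heterochromatic DNA.
hetero_fraction = 0.85
print(is_predominantly(hetero_fraction))                  # above the 50% level
print(is_predominantly(hetero_fraction, threshold=0.9))   # but not the 90% level
print(is_substantially_equivalent(hetero_fraction))
```

Note that a composition of, say, 55% heterochromatin satisfies both predicates; the two descriptions overlap near the 50% mark and the sketch does not attempt to choose between them.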

As used herein, an amplicon is a repeated nucleic acid unit. In some of
the artificial chromosomes described herein, an amplicon may contain a set
of inverted repeats of a megareplicon. A megareplicon represents a higher
order replication unit. For example, with reference to some of the
predominantly heterochromatic artificial chromosomes, particularly eukaryotic
chromosomes, described herein, the megareplicon may contain a set of
tandem DNA blocks (e.g., ~7.5 Mb DNA blocks) each containing satellite
DNA flanked by non-satellite DNA or may substantially be made up of rDNA.
Contained within the megareplicon is a primary replication site, referred to as
the megareplicator, which may be involved in organizing and facilitating
replication of segments of chromosomes, including, for example,
heterochromatin, pericentric heterochromatin, rDNA and/or possibly the
centromeres. Within the megareplicon there may be smaller (e.g., 50-300
kb) secondary replicons.

As used herein, amplifiable, when used in reference to a chromosome,
particularly in the method of generating artificial chromosomes provided
herein, refers to a region of a chromosome that is prone to amplification.
Amplification typically occurs during replication and other cellular events
involving recombination (e.g., DNA repair). Included among such regions
are regions of the chromosome that contain tandem repeats, such as
satellite DNA, rDNA, and other such sequences.

Among the artificial chromosome systems provided herein are those
that are predominantly heterochromatic [formerly referred to as satellite
artificial chromosomes (SATACs); see, e.g., U.S. Patent Nos. 6,077,697
and 6,025,155 and published International PCT application No.
WO 97/40183], minichromosomes which contain a de novo centromere,
artificial chromosomes containing one or more regions of repeating nucleic
acid units wherein the repeat region(s) contain substantially equivalent
amounts of euchromatic and heterochromatic nucleic acid, and in vitro
assembled artificial chromosomes. Of particular interest herein are artificial
chromosomes that introduce and express heterologous nucleic acids in
plants. These include artificial chromosomes that have a centromere derived
from a plant, and, also, artificial chromosomes that have centromeres that
may be derived from other organisms but that function in plants. Methods
for the construction, isolation, and delivery to target cells of each type of
artificial chromosome are provided herein.

As used herein, to target nucleic acid to a locus on a chromosome
means that the nucleic acid integrates at or near the targeted locus. Any
method or means for effecting such integration, including, but not limited to,
homologous recombination, is contemplated.

As used herein, a dicentric chromosome is a chromosome that 
contains two centromeres. A multicentric chromosome contains more than 
two centromeres. 

As used herein, a formerly dicentric chromosome is a chromosome
that is produced when a dicentric chromosome fragments and acquires new
telomeres so that two chromosomes, each having one of the centromeres,
are produced. Each of the fragments is a replicable chromosome. If one of
the chromosomes undergoes amplification of primarily euchromatic DNA to
produce a fully functional chromosome that is predominantly euchromatin
(more than about 50%, more than about 70% or more than about 90%
euchromatin), it is a minichromosome. The remaining chromosome is a
formerly dicentric chromosome. If one of the chromosomes undergoes
amplification, whereby heterochromatin (such as, for example, satellite DNA)
is amplified and a euchromatic portion (such as, for example, an arm)
remains, it is referred to as a sausage chromosome. A chromosome that is
substantially all heterochromatin, except for portions of heterologous DNA, is
called a predominantly heterochromatic artificial chromosome. Predominantly
heterochromatic artificial chromosomes can be produced from other partially
heterochromatic artificial chromosomes by culturing the cell containing such
chromosomes under conditions that destabilize the chromosome and/or under
selective conditions so that a predominantly heterochromatic artificial
chromosome is produced. For purposes herein, it is understood that the
artificial chromosomes may not necessarily be produced in multiple steps,
but may appear after the initial introduction of the heterologous DNA.
Typically, artificial chromosomes appear after about 5 to about 60, or about
5 to about 55, or about 10 to about 55 or about 25 to about 55 or about 35
to about 55 cell divisions following introduction of nucleic acid into a cell.
Artificial chromosomes may, however, appear after only about 5 to about 15
or about 10 to about 15 cell divisions.

As used herein, the term "satellite DNA-based artificial chromosome
(SATAC)" is interchangeable with the term "artificial chromosome expression
system (ACes)". These artificial chromosomes (ACes) include those that are
substantially all neutral non-coding sequences (heterochromatin) except for
foreign heterologous, typically gene or protein-encoding, nucleic acid, that
may be interspersed within the heterochromatin for the expression therein
(see U.S. Patent Nos. 6,025,155 and 6,077,697 and International PCT
application No. WO 97/40183), or that is in a single locus as provided
herein. The delineating structural feature is the presence of repeating units,
which are generally predominantly heterochromatin. The precise structure of
the ACes will depend upon the structure of the chromosome in which the
initial amplification event occurs; all share the common feature of including a
defined pattern of repeating units. Generally, ACes have more
heterochromatin than euchromatin. Foreign nucleic acid molecules
(heterologous genes) contained in these artificial chromosome expression
systems can include any nucleic acid whose expression is of interest in a
particular host cell.

As used herein, an artificial chromosome that is predominantly
heterochromatic (i.e., containing more heterochromatin than euchromatin,
typically more than about 50%, more than about 60%, more than about
70%, more than about 80% or more than about 90% heterochromatin) may
be produced by introducing nucleic acid molecules into cells, particularly
plant cells, and selecting cells that contain a predominantly heterochromatic
artificial chromosome. Any nucleic acid may be introduced into cells in the
methods of producing the artificial chromosomes. For example, the nucleic
acid may contain a selectable marker and/or a sequence that targets nucleic
acid to a heterochromatic region of a chromosome, particularly a plant
chromosome, such as in the pericentric heterochromatin, in the short arm of
acrocentric chromosomes, rDNA or nucleolar organizing regions. Targeting
sequences include, but are not limited to, lambda phage DNA and rDNA
(e.g., a sequence of an intergenic spacer of rDNA), particularly plant rDNA,
for production of predominantly heterochromatic artificial chromosomes in
plant cells.

After introducing the nucleic acid into cells, a cell containing a
predominantly heterochromatic artificial chromosome is selected. Such cells
may be identified using a variety of procedures. For example, repeating units
of heterochromatic DNA of these chromosomes may be discerned by G-
and/or C-banding and/or fluorescence in situ hybridization (FISH) techniques.
Prior to such analyses, the cells to be analyzed may be enriched with
artificial chromosome-containing cells by sorting the cells on the basis of the
presence of a selectable marker, such as a reporter protein, or by growing
(culturing) the cells under selective conditions. Selection of cells containing
amplified nucleic acids may also be facilitated by use of techniques such as
PCR and Southern blotting to identify cell lines with amplified regions. It is
also possible, after introduction of nucleic acids into cells, to select cells that
have a multicentric, typically dicentric, chromosome, a formerly multicentric
(typically dicentric) chromosome and/or various heterochromatic structures
and to treat them such that desired artificial chromosomes are produced.
Conditions for generation of a desired structure include, but are not limited
to, further growth under selective conditions, introduction of additional
nucleic acid molecules and/or growth under selective conditions and
treatment with destabilizing agents, and other such methods (see
International PCT application No. WO 97/40183 and U.S. Patent Nos.
6,025,155 and 6,077,697).

As used herein, heterologous and foreign are used interchangeably
with respect to nucleic acid and refer to any nucleic acid, including DNA and
RNA, that does not occur naturally as part of the genome in which it is
present or which is found in a location or locations in the genome that differ
from that in which it occurs in nature. Thus, heterologous or foreign nucleic
acid is nucleic acid that is not normally found in the host genome in an
identical context. It is nucleic acid that is not endogenous to the cell and
has been exogenously introduced into the cell. Examples of heterologous DNA
include, but are not limited to, DNA that encodes a gene product or gene
product(s) of interest, introduced for purposes of modification of the
endogenous genes or for production of an encoded protein. For example, a
heterologous or foreign gene may be isolated from a different species than
that of the host genome, or alternatively, may be isolated from the host
genome but operably linked to one or more regulatory regions which differ
from those found in the unaltered, native gene. Other examples of
heterologous DNA include, but are not limited to, DNA that encodes traceable
marker proteins, and DNA that encodes a protein that confers an input trait
including, but not limited to, herbicide, insect, or disease resistance or an
output trait, including, but not limited to, oil quality or carbohydrate
composition. Antibodies that are encoded by heterologous DNA may be
secreted, sequestered, stored in an organ or tissue, accumulate in the
cytoplasm or cellular organelles or be expressed on the surface of the cell
into which the heterologous DNA has been introduced.

As used herein, a "selectable marker" is a composition that can be
used to distinguish one cell from another cell. For example, a selectable
marker may be a nucleic acid encoding a readily detected protein that has
been introduced into some cells but not others. Detection of the expressed
protein in cells facilitates identification of cells containing the marker nucleic
acid by distinguishing them from cells that do not contain the nucleic acid.
Thus, for example, a selectable marker may be a fluorescent protein, such as
green fluorescent protein (GFP), or β-galactosidase (or a nucleic acid
encoding either of these proteins). Selectable markers such as these, which
are not required for cell survival and/or proliferation in the presence of a
selection agent, may also be referred to as reporter molecules. Other
selectable markers, e.g., the neomycin phosphotransferase gene, provide for
isolation and identification of cells containing them by conferring properties
on the cells that make them resistant to an agent, e.g., a drug such as an
antibiotic, that inhibits proliferation of cells that do not contain the marker.

As used herein, growth under selective conditions means growth of a 
cell under conditions that require expression of a selectable marker for
survival. 

As used herein, an agent that destabilizes a chromosome is any agent 
known by those of skill in the art to enhance amplification events, and/or 
mutations. Such agents, which include BrdU, are well known to those of
skill in the art.

In order to generate an artificial chromosome containing a particular
heterologous nucleic acid of interest, it is possible to include the nucleic acid
of interest in the nucleic acid that is being introduced into cells to initiate
production of the artificial chromosome. Thus, for example, a nucleic acid of
interest could be introduced into a cell along with nucleic acid encoding a
selectable marker and/or a nucleic acid that targets to a heterochromatic
region of a chromosome. For example, the nucleic acid of interest can be
linked to targeting nucleic acid(s). Alternatively, heterologous nucleic acid of
interest can be introduced into an artificial chromosome at a later time after
the initial generation of the artificial chromosome.

As used herein, the minichromosome refers to a chromosome derived
from a multicentric, typically dicentric, chromosome that contains more
euchromatic than heterochromatic DNA. For purposes herein, the
minichromosome contains a de novo centromere, preferably a centromere
that replicates in plants, more preferably a plant centromere.

As used herein, de novo with reference to a centromere, refers to
generation of an excess centromere in a chromosome as a result of
incorporation of a heterologous nucleic acid fragment using the methods
herein.

As used herein, in vitro assembled artificial chromosomes or synthetic
chromosomes are artificial chromosomes produced by joining essential
components of a chromosome in vitro. These components include at least a
centromere, a telomere and an origin of replication. An in vitro assembled
artificial chromosome may include one or more megareplicators. In particular
embodiments, the megareplicator contains sequences of rDNA, particularly
plant rDNA.

As used herein, in vitro assembled plant artificial chromosomes are
produced by joining components (e.g., the centromere, telomere(s),
megareplicator and an origin of replication) that function in plants, and
preferably, one or more of which is derived from a plant. In vitro assembled
artificial chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, or may contain
increasing amounts of euchromatic DNA, such that, for example, it contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
about 90% euchromatic DNA. In vitro assembled artificial chromosomes
may contain one or more regions of segmentation as described with
reference to amplification-based artificial chromosomes.

As used herein, an artificial chromosome platform refers to an artificial
chromosome that has been engineered to include one or more sites for
site-specific recombination-directed integration. Included within the artificial
chromosome platforms are ACes, particularly plant ACes, that are
so-engineered. Any sites, including but not limited to any described herein,
that are suitable for such integration are contemplated. Among the ACes
contemplated herein are those that are predominantly heterochromatic
(formerly referred to as satellite artificial chromosomes (SATACs); see, e.g.,
U.S. Patent Nos. 6,077,697 and 6,025,155 and published International PCT
application No. WO 97/40183), artificial chromosomes predominantly made
up of repeating nucleic acid units and that contain substantially equivalent
amounts of euchromatic and heterochromatic DNA or wherein the repeat
regions of the chromosomes contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid. Included among the ACes for
use in generating platforms are artificial chromosomes that introduce and
express heterologous nucleic acids in plants as described herein. These
include artificial chromosomes that have a centromere derived from a plant,
and, also, artificial chromosomes that have centromeres that may be derived
from other organisms but that function in plants.

As used herein, recognition sequences are particular sequences of
nucleotides that a protein, DNA, or RNA molecule, or combinations thereof
(such as, but not limited to, a restriction endonuclease, a modification
methylase and a recombinase), recognizes and binds. For example, a
recognition sequence for Cre recombinase (see, e.g., SEQ ID No. 30) is a 34
base pair sequence containing two 13 base pair inverted repeats (serving as
the recombinase binding sites) flanking an 8 base pair core and designated
loxP (see, e.g., Sauer (1994) Current Opinion in Biotechnology 5:521-527).
Other examples of recognition sequences include, but are not limited to,
attB and attP, attR and attL and others (see, e.g., SEQ ID Nos. 32-48), that
are recognized by the recombinase enzyme Integrase (see SEQ ID Nos. 49
and 50 for the nucleotide and encoded amino acid sequences of an
exemplary lambda phage integrase).

The recombination site designated attB is an approximately 33 base
pair sequence containing two 9 base pair core-type Int binding sites and a 7
base pair overlap region; attP (SEQ ID No. 48) is an approximately 240 base
pair sequence containing core-type Int binding sites and arm-type Int binding
sites as well as sites for auxiliary proteins IHF, FIS, and Xis (see, e.g., Landy
(1993) Current Opinion in Biotechnology 3:699-707; see, e.g., SEQ ID Nos.
32 and 48).

As used herein, a recombinase is an enzyme that catalyzes the
exchange of DNA segments at specific recombination sites. An integrase
herein refers to a recombinase that is a member of the lambda (λ) integrase
family.

As used herein, recombination proteins include excisive proteins,
integrative proteins, enzymes, co-factors and associated proteins that are
involved in recombination reactions using one or more recombination sites
(see, Landy (1993) Current Opinion in Biotechnology 3:699-707).

As used herein the expression "lox site" means a sequence of
nucleotides at which the gene product of the cre gene, referred to
herein as Cre, can catalyze a site-specific recombination event. A LoxP site
is a 34 base pair nucleotide sequence from bacteriophage P1 (see, e.g.,
Hoess et al. (1982) Proc. Natl. Acad. Sci. U.S.A. 79:3398-3402). The LoxP
site contains two 13 base pair inverted repeats separated by an 8 base pair
spacer region as follows (SEQ ID NO. 51):

ATAACTTCGTATA ATGTATGC TATACGAAGTTAT

E. coli DH5Δlac and yeast strain BSY23 transformed with plasmid pBS44
carrying two loxP sites connected with a LEU2 gene are available from the
American Type Culture Collection (ATCC) under accession numbers ATCC
53254 and ATCC 20773, respectively. The lox sites can be isolated from
plasmid pBS44 with restriction enzymes EcoRI and SalI, or XhoI and BamHI.
In addition, a preselected DNA segment can be inserted into pBS44 at either
the SalI or BamHI restriction enzyme sites. Other lox sites include, but are
not limited to, LoxB, LoxL, LoxC2 and LoxR sites, which are nucleotide
sequences isolated from E. coli (see, e.g., Hoess et al. (1982) Proc. Natl.
Acad. Sci. U.S.A. 79:3398). Lox sites can also be produced by a variety of
synthetic techniques (see, e.g., Ito et al. (1982) Nuc. Acid Res. 10:1755 and
Ogilvie et al. (1981) Science 214:270).

As used herein, the expression "cre gene" means a sequence of
nucleotides that encodes a gene product that effects site-specific
recombination of DNA in eukaryotic cells at lox sites. One cre gene can be
isolated from bacteriophage P1 (see, e.g., Abremski et al. (1983) Cell
32:1301-1311). E. coli DH1 and yeast strain BSY90 transformed with
plasmid pBS39 carrying a cre gene isolated from bacteriophage P1 and a
GAL1 regulatory nucleotide sequence are available from the American Type
Culture Collection (ATCC) under accession numbers ATCC 53255 and ATCC
20772, respectively. The cre gene can be isolated from plasmid pBS39 with
restriction enzymes XhoI and SalI.
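
The LoxP architecture stated in the lox site definition (two 13 base pair inverted repeats separated by an 8 base pair spacer) can be checked directly against the printed sequence. The following Python sketch is illustrative only and is not part of the disclosed methods; the function names are hypothetical, and the sequence is SEQ ID NO. 51 as printed in this section with the spaces removed.

```python
# Illustrative sketch only; helper names are assumptions, not from the specification.
LOXP = "ATAACTTCGTATAATGTATGCTATACGAAGTTAT"  # SEQ ID NO. 51, spaces removed

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_lox_site(site: str) -> tuple[str, str, str]:
    """Split a 34 bp lox site into (left repeat, spacer, right repeat)."""
    if len(site) != 34:
        raise ValueError("a LoxP-type site is expected to be 34 bp long")
    return site[:13], site[13:21], site[21:]


def has_loxp_architecture(site: str) -> bool:
    """True if the 13 bp arms are inverted repeats and the spacer is asymmetric."""
    left, spacer, right = split_lox_site(site)
    arms_are_inverted_repeats = right == reverse_complement(left)
    spacer_is_asymmetric = spacer != reverse_complement(spacer)
    return arms_are_inverted_repeats and spacer_is_asymmetric
```

The asymmetry test on the spacer reflects the statement elsewhere in this section that the lox site is an asymmetrical sequence, which is what gives each site an orientation.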

As used herein, site-specific recombination refers to site-specific 
recombination that is effected between two specific sites on a single nucleic 
acid molecule or between two different molecules that requires the presence 
of an exogenous protein, such as an integrase or recombinase. 

For example, Cre-lox site-specific recombination can include the
following three events:

a. deletion of a pre-selected DNA segment flanked by lox sites;

b. inversion of the nucleotide sequence of a pre-selected DNA segment
flanked by lox sites; and

c. reciprocal exchange of DNA segments proximate to lox sites located on
different DNA molecules.

This reciprocal exchange of DNA segments can result in an integration
event if one or both of the DNA molecules are circular. DNA segment refers
to a linear fragment of single- or double-stranded deoxyribonucleic acid
(DNA), which can be derived from any source. Since the lox site is an
asymmetrical nucleotide sequence, two lox sites on the same DNA molecule
can have the same or opposite orientations with respect to each other.
Recombination between lox sites in the same orientation results in a deletion
of the DNA segment located between the two lox sites and a connection
between the resulting ends of the original DNA molecule. The deleted DNA
segment forms a circular molecule of DNA. The original DNA molecule and
the resulting circular molecule each contain a single lox site. Recombination
between lox sites in opposite orientations on the same DNA molecule results
in an inversion of the nucleotide sequence of the DNA segment located
between the two lox sites. In addition, reciprocal exchange of DNA
segments proximate to lox sites located on two different DNA molecules can
occur. All of these recombination events are catalyzed by the gene product
of the cre gene. Thus, the Cre-lox system can be used to specifically delete,
invert, or insert DNA. The precise event is controlled by the orientation of
lox DNA sequences: in cis, the lox sequences direct the Cre recombinase to
either delete (lox sequences in direct orientation) or invert (lox sequences in
inverted orientation) DNA flanked by the sequences, while in trans the lox
sequences can direct a homologous recombination event resulting in the
insertion of a recombinant DNA.

As used herein, a plant refers to an organism that is taxonomically
classified as being in the kingdom Plantae. Such organisms include
eukaryotic organisms that contain chloroplasts capable of carrying out
photosynthesis. A plant can be unicellular or multicellular and can contain
multiple tissues and/or organs. Plants can reproduce sexually and/or
asexually and include species that are perennial or annual in growth habit.
Plants can be found to exist in a variety of habitats, including terrestrial and
aquatic environments. The term "plant" includes a whole plant, plant cell,
plant protoplast, plant calli, plant seed, plant organ, plant tissue, and other
parts of a whole plant.

As used herein, reproductive mode with reference to a plant refers to
any and all methods by which a plant produces progeny. Reproductive
modes include, but are not limited to, sexual and asexual reproduction.
Plants may produce progeny by one or multiple reproductive modes. Sexual
reproduction can include union of cells derived from haploid gametophytes
(e.g., eggs produced from ovules and sperm produced from pollen in seed
plants) to form diploid zygotes. Zygotes may be formed from gametophytes
from different plants or from gametophytes of the same plant (e.g., through
self-fertilization). Asexual reproduction can occur when offspring are
produced through modifications of the sexual life cycle that do not include
meiosis and syngamy. For example, when vascular plants reproduce
asexually, they may do so by vegetative reproduction, such as budding,
branching, and tillering, or by producing spores or seed genetically identical
to the sporophytes that produced them.

As used herein, stable maintenance of chromosomes occurs when at
least about 85%, preferably 90%, more preferably 95%, of the cells retain
the chromosome. Stability is measured in the presence of a selective agent.
Preferably these chromosomes are also maintained in the absence of a
selective agent. Stable chromosomes also retain their structure during cell
culturing, suffering no unintended intrachromosomal nor interchromosomal
rearrangements.
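
The retention thresholds in this definition can be expressed as a simple calculation. The Python sketch below is illustrative only: the function name, the tier labels and the idea of scoring a counted sample of cells are assumptions made for the example, and "about" in the definition is treated as an exact cutoff purely for simplicity.

```python
# Illustrative sketch; names and tier labels are assumptions for this example.
def maintenance_tier(cells_retaining: int, cells_scored: int) -> str:
    """Classify chromosome retention against the 85%, 90% and 95% thresholds."""
    if cells_scored <= 0 or not 0 <= cells_retaining <= cells_scored:
        raise ValueError("counts must satisfy 0 <= retaining <= scored, scored > 0")
    # Integer arithmetic avoids floating-point rounding at the cutoffs.
    percent_times_scored = cells_retaining * 100
    if percent_times_scored >= 95 * cells_scored:
        return "stable (more preferred, at least 95%)"
    if percent_times_scored >= 90 * cells_scored:
        return "stable (preferred, at least 90%)"
    if percent_times_scored >= 85 * cells_scored:
        return "stable (at least 85%)"
    return "not stably maintained"
```

Retention alone does not establish stability under the definition, which also requires that the chromosome keep its structure without unintended rearrangements.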

As used herein, BrdU refers to 5-bromodeoxyuridine, which during 
replication is inserted in place of thymidine. BrdU is used as a mutagen; it 
also inhibits condensation of metaphase chromosomes during cell division. 

As used herein, ribosomal RNA (rRNA) is the specialized RNA that
forms part of the structure of a ribosome and participates in the synthesis of
proteins. Ribosomal RNA is produced by transcription of genes which, in
eukaryotic cells, are present in multiple copies. In human cells, the
approximately 250 copies of rRNA genes (i.e., genes which encode rRNA)
per haploid genome are spread out in clusters on at least five different
chromosomes (chromosomes 13, 14, 15, 21 and 22). In mouse cells, the
presence of ribosomal DNA (rDNA, which is DNA containing sequences that
encode rRNA) has been verified on at least 11 pairs out of 20 mouse
chromosomes (chromosomes 5, 6, 7, 9, 11, 12, 15, 16, 17, 18, and 19)
[see e.g., Rowe et al. (1996) Mamm. Genome 7:886-889 and Johnson et al.
(1993) Mamm. Genome 4:49-52]. In Arabidopsis thaliana the presence of
rDNA has been verified on chromosomes 2 and 4 (18S, 5.8S, and 25S
rDNA) and on chromosomes 3, 4, and 5 (5S rDNA) [see The Arabidopsis
Genome Initiative (2000) Nature 408:796-815]. In eukaryotic cells, the
multiple copies of the highly conserved rRNA genes are located in a tandemly
arranged series of rDNA units, which are generally about 40-45 kb in length
and contain a transcribed region and a nontranscribed region known as
spacer (i.e., intergenic spacer) DNA which can vary in length and sequence.
In the human and mouse, these tandem arrays of rDNA units are located
adjacent to the pericentric satellite DNA sequences (heterochromatin). The
regions of these chromosomes in which the rDNA is located are referred to
as nucleolar organizing regions (NOR) which loop into the nucleolus, the site
of ribosome production within the cell nucleus. In higher plants, the rDNA is
arranged in long tandem repeating units, similar to those of other higher
eukaryotes. The 18S, 5.8S and 25S rRNA genes are clustered and are
transcribed as one unit, while the 5S genes are located elsewhere in the
genome. Between the 3' end of the 25S gene and the 5' end of the 18S
gene is located a DNA spacer that ranges from 1 kb to greater than 12 kb in
length for different species. Therefore, the rDNA repeat ranges from about 4
kb to about 15 kb for different plant species [see, e.g., Rogers and Bendich
(1987) Plant Mol. Biol. 9:509-520].

As used herein, a megachromosome refers to a chromosome that,
except for introduced heterologous DNA, is substantially composed of
heterochromatin. Megachromosomes are made up of an array of repeated
amplicons that contain two inverted megareplicons bordered by introduced
heterologous DNA [see, e.g., Figure 3 of U.S. Patent No. 6,077,697 for a
schematic drawing of a megachromosome]. For purposes herein, a
megachromosome is about 50 to 400 Mb, generally about 250-400 Mb.
Shorter variants are also referred to as truncated megachromosomes [about
90 to 120 or 150 Mb], dwarf megachromosomes [~150-200 Mb] and cell
lines, and a micro-megachromosome [~50-90 Mb, typically 50-60 Mb]. For
purposes herein, the term megachromosome refers to the overall repeated
structure based on an array of repeated chromosomal segments (amplicons)
that contain two inverted megareplicons bordered by any inserted
heterologous DNA.

As used herein, transformation and transfection are used
interchangeably to refer to the process of introducing nucleic acid
into cells. The terms transfection and transformation refer to the
taking up of exogenous nucleic acid, e.g., an expression vector, by a host
cell whether or not any coding sequences are in fact expressed. Numerous
methods of introducing nucleic acids into cells are known to the ordinarily
skilled artisan, for example, Agrobacterium-mediated transformation,
protoplast transfection (including polyethylene glycol (PEG)-mediated
transfection, electroporation, protoplast fusion, and microcell fusion),
lipid-mediated delivery, liposomes, electroporation, microinjection, particle
bombardment and silicon carbide whisker-mediated transformation (see, e.g.,
Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985) Mol.
Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology 4:1001-1004;
Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J and Vasil,
L.K. Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 6:941-948), direct uptake using calcium phosphate [CaPO4;
see, e.g., Wigler et al. (1979) Proc. Natl. Acad. Sci. U.S.A. 76:1373-1376],
polyethylene glycol [PEG]-mediated DNA uptake, lipofection [see, e.g.,
Strauss (1996) Meth. Mol. Biol. 54:307-327], microcell fusion [see Lambert
(1991) Proc. Natl. Acad. Sci. U.S.A. 88:5907-5911; U.S. Patent No.
5,396,767; Sawford et al. (1987) Somatic Cell Mol. Genet. 13:279-284;
Dhar et al. (1984) Somatic Cell Mol. Genet. 10:547-559; and McNeill-Killary
et al. (1995) Meth. Enzymol. 254:133-152], lipid-mediated carrier systems
[see, e.g., Teifel et al. (1995) Biotechniques 19:79-80; Albrecht et al. (1996)
Ann. Hematol. 72:73-79; Holmen et al. (1995) In Vitro Cell Dev. Biol. Anim.
31:347-351; Remy et al. (1994) Bioconjug. Chem. 5:647-654; Le Bolch et
al. (1995) Tetrahedron Lett. 36:6681-6684; Loeffler et al. (1993) Meth.
Enzymol. 217:599-618] or other suitable method. Successful transfection is
generally recognized by detection of the presence of the heterologous nucleic
acid within the transfected cell, such as, for example, any visualization of the
heterologous nucleic acid or any indication of the operation of a vector within
the host cell.

As used herein, injected refers to the microinjection (use of a small
syringe, needle, or pipette) of nucleic acid into a cell.

As used herein, gene therapy involves the transfer or insertion of
nucleic acid molecules into certain cells, which are also referred to as target
cells, to produce products that are involved in preventing, curing, correcting,
controlling or modulating diseases, disorders and/or deleterious conditions.
The nucleic acid is introduced into the selected target cells in a manner such
that the nucleic acid is expressed and a product encoded thereby is
produced. Alternatively, the nucleic acid may in some manner mediate
expression of DNA that encodes a therapeutic product. This product may be
a therapeutic compound, which is produced in therapeutically effective
amounts or at a therapeutically useful time. It may also encode a product,
such as a peptide or RNA, that in some manner mediates, directly or
indirectly, expression of a therapeutic product. Expression of the nucleic
acid by the target cells within an organism afflicted with a disease or
disorder thereby enables modulation of the disease or disorder. The nucleic
acid encoding the therapeutic product may be modified prior to introduction
into the cells of the afflicted host in order to enhance or otherwise alter the
product or expression thereof.

For use in gene therapy, cells can be transfected in vitro, followed by
introduction of the transfected cells into an organism. This is often referred
to as ex vivo gene therapy. Alternatively, the cells can be transfected
directly in vivo within an organism.

As used herein, a therapeutically effective product is a product that
effectively ameliorates or eliminates the symptoms or manifestations of an
inherited or acquired disease or disorder or that cures said disease or disorder
in an organism. For example, therapeutically effective products include a
product that is encoded by heterologous DNA expressed in a diseased
organism and a product produced from heterologous DNA in a host cell and
to which a diseased organism is exposed.

As used herein, a transgenic plant refers to a plant (e.g., a plant cell,
tissue, organ or whole plant) containing heterologous or foreign nucleic acid
or in which the expression of a gene naturally present in the plant has been
altered. Heterologous nucleic acid within a transgenic plant may be
transiently or stably maintained within the plant. Stable maintenance of
heterologous nucleic acid may be maintenance of the nucleic acid through
one or more, or two or more, or five or more, or ten or more, or 25 or more,
or 50 or more or 60 or more cell divisions. A transgenic plant may contain
heterologous nucleic acid in one cell, multiple cells or all cells. A transgenic
plant may produce progeny that contain or do not contain the heterologous
nucleic acid.

As used herein, a promoter, with respect to a region of DNA, refers to 
a sequence of DNA that contains a sequence of bases that signals RNA 
polymerase to associate with the DNA and initiate transcription of messenger 
RNA (mRNA) from a template strand of the DNA. A promoter thus generally
regulates transcription of DNA into mRNA.

As used herein, operative linkage of heterologous DNA to regulatory 
and effector sequences of nucleotides, such as promoters, enhancers, 
transcriptional and translational stop sites, and other signal sequences refers 
to the relationship between such DNA and such sequences of nucleotides.
For example, operative linkage of heterologous DNA to a promoter refers to
the physical relationship between the DNA and the promoter such that the 
transcription of such DNA is initiated from the promoter by an RNA 
polymerase that specifically recognizes, binds to and transcribes the DNA in 
reading frame. 

As used herein, isolated, substantially pure nucleic acid, such as, for 
example, DNA, refers to nucleic acid fragments purified according to 
standard techniques employed by those skilled in the art, such as that found 
in Maniatis et al. [(1982) Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY]. 

As used herein, expression refers to the transcription and/or 
translation of nucleic acid. For example, expression can be the transcription 
of a gene into an RNA molecule, such as a messenger RNA (mRNA) 
molecule. Expression may further include translation of an RNA molecule 
into peptides, polypeptides, or proteins. If the nucleic acid is derived from 
genomic DNA, expression may, if an appropriate eukaryotic host cell or 
organism is selected, include splicing of the mRNA. With respect to an 
antisense construct, expression may refer to the transcription of the 
antisense DNA. 

As used herein, vector or plasmid refers to discrete elements that are 
used to introduce heterologous nucleic acids into cells for either expression 
of the heterologous nucleic acid or for replication of the heterologous nucleic 
acid. Selection and use of such vectors and plasmids are well within the 
level of skill of the art. 

As used herein, substantially homologous DNA refers to DNA that 
includes a sequence of nucleotides that is sufficiently similar to another such 
sequence to form stable hybrids under specified conditions. 

It is well known to those of skill in this art that nucleic acid fragments
with different sequences may, under the same conditions, hybridize
detectably to the same "target" nucleic acid. Two nucleic acid fragments
hybridize detectably, under stringent conditions over a sufficiently long
hybridization period, because one fragment contains a segment of at least
about 14 nucleotides in a sequence which is complementary (or nearly
complementary) to the sequence of at least one segment in the other nucleic
acid fragment. If the time during which hybridization is allowed to occur is
held constant, at a value during which, under preselected stringency
conditions, two nucleic acid fragments with exactly complementary
base-pairing segments hybridize detectably to each other, departures from
exact complementarity can be introduced into the base-pairing segments, and
base-pairing will nonetheless occur to an extent sufficient to make
hybridization detectable. As the departure from complementarity between the
base-pairing segments of two nucleic acids becomes larger, and as conditions
of the hybridization become more stringent, the probability decreases that the
two segments will hybridize detectably to each other.
Two single-stranded nucleic acid segments have "substantially the
same sequence," within the meaning of the present specification, if (a) both
form a base-paired duplex with the same segment, and (b) the melting
temperatures of said two duplexes in a solution of 0.5 x SSPE differ by less
than 10°C. If the segments being compared have the same number of
bases, then to have "substantially the same sequence", they will typically
differ in their sequences at fewer than 1 base in 10. Methods for determining
melting temperatures of nucleic acid duplexes are well known [see, e.g.,
Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284 and references
cited therein].
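
As an informal illustration of the two criteria above, the following Python
sketch checks criterion (b) from two separately measured melting temperatures
and computes the fraction of differing bases for segments of equal length.
The function names and example values are hypothetical and are not part of
this specification; the melting temperatures are assumed to have been
measured in 0.5 x SSPE by one of the known methods.

```python
def mismatch_fraction(seq_a: str, seq_b: str) -> float:
    """Fraction of positions at which two equal-length segments differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("segments must have the same number of bases")
    if not seq_a:
        raise ValueError("segments must not be empty")
    diffs = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return diffs / len(seq_a)


def tm_criterion_met(tm_a_c: float, tm_b_c: float, max_delta_c: float = 10.0) -> bool:
    """Criterion (b): the two duplex melting temperatures differ by < 10 C."""
    return abs(tm_a_c - tm_b_c) < max_delta_c
```

A mismatch fraction below 0.1 corresponds to the "fewer than 1 base in 10"
guideline; it is only the typical outcome described above, not a substitute
for the melting temperature comparison.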

As used herein, a nucleic acid probe is a DNA or RNA fragment that
includes a sufficient number of nucleotides to specifically hybridize to DNA or
RNA that includes identical or closely related sequences of nucleotides. A
probe may contain any number of nucleotides, from as few as about 10 to
as many as hundreds of thousands of nucleotides. The conditions and
protocols for such hybridization reactions are well known to those of skill in
the art as are the effects of probe size, temperature, degree of mismatch,
salt concentration and other parameters on the hybridization reaction. For
example, the lower the temperature and higher the salt concentration at
which the hybridization reaction is carried out, the greater the degree of
mismatch that may be present in the hybrid molecules.

To be used as a hybridization probe, the nucleic acid is generally
rendered detectable by labelling it with a detectable moiety or label, such as
32P, 3H and 14C, or by other means, including chemical labelling, such as by
nick-translation in the presence of deoxyuridylate biotinylated at the
5'-position of the uracil moiety. The resulting probe includes the biotinylated
uridylate in place of thymidylate residues and can be detected (via the biotin
moieties) by any of a number of commercially available detection systems
based on binding of streptavidin to the biotin. Such commercially available
detection systems can be obtained, for example, from Enzo Biochemicals,
Inc. (New York, NY). Any other label known to those of skill in the art,
including non-radioactive labels, may be used as long as it renders the probes
sufficiently detectable, which is a function of the sensitivity of the assay, the
time available (for culturing cells, extracting DNA, and hybridization assays),
the quantity of DNA or RNA available as a source of the probe, the particular
label and the means used to detect the label.

Once sequences with a sufficiently high degree of homology to the
probe are identified, they can readily be isolated by standard techniques,
which are described, for example, by Maniatis et al. [(1982) Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY].

As used herein, conditions under which DNA molecules form stable
hybrids and are considered substantially homologous are such that DNA
molecules with at least about 60% complementarity form stable hybrids.
Such DNA fragments are herein considered to be "substantially
homologous". For example, DNA that encodes a particular protein is
substantially homologous to another DNA fragment if the DNA forms stable
hybrids such that the sequences of the fragments are at least about 60%
complementary and if a protein encoded by the DNA retains its activity.
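
The "at least about 60% complementarity" threshold can be pictured with a
simple count of base pairs between two strands. The Python sketch below is
only an assumption-laden illustration: it compares two equal-length strands,
both written 5' to 3', without gaps, and it says nothing about whether a
stable hybrid actually forms under given conditions. The function names are
hypothetical.

```python
# Watson-Crick partners for DNA bases; other symbols never count as paired.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementarity(strand_a: str, strand_b: str) -> float:
    """Percent of positions in strand_a that pair with antiparallel strand_b."""
    a, b = strand_a.upper(), strand_b.upper()
    if len(a) != len(b) or not a:
        raise ValueError("strands must be non-empty and of equal length")
    # strand_b is read 3'->5' so that it aligns antiparallel to strand_a.
    paired = sum(1 for x, y in zip(a, reversed(b)) if _COMPLEMENT.get(x) == y)
    return 100.0 * paired / len(a)


def meets_homology_threshold(strand_a: str, strand_b: str, minimum: float = 60.0) -> bool:
    """True if the ungapped complementarity is at least `minimum` percent."""
    return percent_complementarity(strand_a, strand_b) >= minimum
```

For example, a 10-base strand paired with a partner that matches at 6 of 10
positions sits exactly at the 60% boundary under this counting scheme.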

For purposes herein, the following stringency conditions are defined:
1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C
2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C
3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C
or any combination of salt and temperature and other reagents that result in
selection of the same degree of mismatch or matching.
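
The three defined conditions can be held in a small lookup table, for example
when annotating hybridization records. The Python sketch below is one
possible encoding; the class and function names are hypothetical, and the
comparison rule (lower salt and higher temperature mean higher stringency)
is a simplification that ignores the "other reagents" mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WashCondition:
    sspe_x: float       # SSPE concentration as a multiple of 1 x SSPE
    sds_percent: float  # SDS concentration in percent
    temp_c: float       # temperature in degrees Celsius


# Values copied from the stringency definitions above.
STRINGENCY = {
    "high": WashCondition(sspe_x=0.1, sds_percent=0.1, temp_c=65.0),
    "medium": WashCondition(sspe_x=0.2, sds_percent=0.1, temp_c=50.0),
    "low": WashCondition(sspe_x=1.0, sds_percent=0.1, temp_c=50.0),
}


def at_least_as_stringent(a: WashCondition, b: WashCondition) -> bool:
    """True if `a` has no more salt and no lower temperature than `b`."""
    return a.sspe_x <= b.sspe_x and a.temp_c >= b.temp_c
```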

As used herein, all assays and procedures, such as hybridization
reactions and antibody-antigen reactions, unless otherwise specified, are
conducted under conditions recognized by those of skill in the art as
standard conditions.

A. Amplification of Chromosomal Segments and Use Thereof in the 
Generation of Artificial Chromosomes 

The methods, cells and artificial chromosomes provided herein are
produced by virtue of the discovery of the existence of a higher-order
replication unit (megareplicon) of the centromeric region, including the
pericentric DNA, of a chromosome. This megareplicon is delimited by a
primary replication initiation site (megareplicator), and appears to facilitate
replication of the centromeric heterochromatin, and, most likely,
centromeres. Integration of heterologous nucleic acid into the megareplicator
region, or in close proximity thereto, initiates a large-scale amplification of
megabase-size chromosomal segments. Products of such amplification may
be used as artificial chromosomes or in the generation of artificial
chromosomes as described herein.

Included among the DNA sequences that may provide a
megareplicator are the rDNA units that give rise to ribosomal RNA (rRNA). In
plants and animals, particularly mammals such as mice and humans, these
rDNA units can contain specialized elements, such as the origin of replication
(or origin of bidirectional replication, i.e., OBR, in mouse) and
amplification-promoting sequences (APS) and amplification control elements
(ACE) [see, e.g., with respect to plant rDNA, U.S. Patent Nos. 6,096,546 (to
Raskin) and 6,100,092 (to Borysyuk et al.); PCT International Application
Publication No. WO99/66058; GenBank Accession no. Y08422 (containing the
central AT-rich region of a tobacco rDNA intergenic spacer); Borysyuk et al.
(1997) Plant Mol. Biol. 35:655-660; Borysyuk et al. (2000) Nature
Biotechnology 18:1303-1306; Hernandez et al. (1993) EMBO J. 12:1475-1485;
Van't Hof and Lamm (1992) Plant Mol. Biol. 20:377-382; Hernandez et al.
(1988) Plant Mol. Biol. 10:413-322; and with respect to mammalian rDNA,
Gogel et al. (1996) Chromosoma 104:511-518; Coffman et al. (1993) Exp.
Cell. Res. 209:123-132; Little et al. (1993) Mol. Cell. Biol. 13:6600-6613;
Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489; Gonzalez and Sylvester
(1995) Genomics 27:320-328; Miesfeld and Arnheim (1982) Nuc. Acids Res.
10:3933-3949; Maden et al. (1987) Biochem. J. 246:519-527].

As described herein, without being bound by any theory, specialized
elements such as these may facilitate replication and/or amplification of
megabase-size chromosomal segments in the de novo formation of
chromosomes, such as the artificial chromosomes described herein, in cells.
These specialized elements are typically located in the nontranscribed
intergenic spacer region upstream of the transcribed region of rDNA. The
intergenic spacer region may itself contain internally repeated sequences
which can be classified as tandemly repeated blocks and nontandem blocks
(see, e.g., Gonzalez and Sylvester (1995) Genomics 27:320-328). In mouse
rDNA, an origin of bidirectional replication may be found within a 3-kb
initiation zone centered approximately 1.6 kb upstream of the transcription
start site (see, e.g., Gogel et al. (1996) Chromosoma 104:511-518). The
sequences of these specialized elements tend to have an altered chromatin
structure, which may be detected, for example, by nuclease hypersensitivity
or the presence of AT-rich regions that can give rise to bent DNA structures.

Sequences of intergenic spacer regions of plant rDNA include, but are
not limited to, sequences contained in GenBank Accession numbers S70723
(from the 5S rDNA of barley (Hordeum vulgare)), AF013103 and X03989
(from maize (Zea mays)), X65489 (from potato (Solanum tuberosum)),
X52265 (from tomato (Lycopersicon esculentum)), AF177418 (from
Arabidopsis neglecta), AF177421 and AF17422 (from Arabidopsis halleri),
A71562, X15550, and X52631 (from Arabidopsis thaliana; see Gruendler et
al. (1991) J. Mol. Biol. 221:1209-1222 and Gruendler et al. (1989) Nucleic
Acids Res. 17:6395-6396), X54194 (from rice (Oryza sativa)) and Y08422
and D76443 (from tobacco (Nicotiana tabacum)). Sequences of intergenic
spacer regions of plant rDNA further include sequences from rye (see Appels
et al. (1986) Can. J. Genet. Cytol. 28:673-685), wheat (see Barker et al.
(1988) J. Mol. Biol. 201:1-17 and Sardana and Flavell (1996) Genome
39:288-292), radish (see Delcasso-Tremousaygue et al. (1988) Eur. J.
Biochem. 172:767-776), Vicia faba and Pisum sativum (see Kato et al.
(1990) Plant Mol. Biol. 14:983-993), mung bean (see Gerstner et al. (1988)
Genome 30:723-733; and Schiebel et al. (1989) Mol. Gen. Genet.
218:302-307), tomato (see Schmidt-Puchta et al. (1989) Plant Mol. Biol.
13:251-253), Hordeum bulbosum (see Procunier et al. (1990) Plant Mol. Biol.
15:661-663) and Lens culinaris Medik., and other legume species (see
Fernandez et al. (2000) Genome 43:597-603). Nucleic acids containing
intergenic spacer sequences from plants can be obtained by nucleic acid
amplification of DNA from plant cells using oligonucleotide primers
corresponding to the 3' end of the conserved 25S mature rRNA encoding
region and the 5' end of the conserved 18S mature rRNA encoding region
(see, e.g., PCT Application Publication No. WO98/13505).

An exemplary sequence encompassing a mammalian origin of
replication is provided in GENBANK accession no. X82564 at about positions
2430-5435. Exemplary sequences encompassing mammalian
amplification-promoting sequences include nucleotides 690-1060 and
1105-1530 of GENBANK accession no. X82564 and are also provided in PCT
Application Publication No. WO 97/40183. Exemplary sequences encompassing
plant amplification-promoting sequences (APS) include those provided in U.S.
Patent No. 6,100,092.

In human rDNA, a primary replication initiation site may be found a
few kilobase pairs upstream of the transcribed region and secondary initiation
sites may be found throughout the nontranscribed intergenic spacer region
(see, e.g., Yoon et al. (1995) Mol. Cell. Biol. 15:2482-2489). A complete
human rDNA repeat unit is presented in GENBANK as accession no. U13369.
Another exemplary sequence encompassing a replication initiation site may
be found within the sequence of nucleotides 35355-42486 in GENBANK
accession no. U13369, particularly within the sequence of nucleotides
37912-42486 and more particularly within the sequence of nucleotides
37912-39288 of GENBANK accession no. U13369 (see Coffman et al.
(1993) Exp. Cell. Res. 209:123-132).
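
The nucleotide ranges cited above follow the database convention of 1-based,
inclusive positions, which differs from the 0-based, end-exclusive slicing
used by many programming languages. The Python sketch below shows one way to
extract such a range from a sequence that has already been loaded as a
string; the function names and the `REGIONS` table are hypothetical, and no
database access is attempted.

```python
def region_length(start: int, end: int) -> int:
    """Number of nucleotides in a 1-based, inclusive range."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def subsequence(seq: str, start: int, end: int) -> str:
    """Return nucleotides start..end (1-based, inclusive) of `seq`."""
    if end > len(seq):
        raise ValueError("range extends past the end of the sequence")
    region_length(start, end)  # validates start and end
    return seq[start - 1:end]


# Ranges quoted in the text for accession no. U13369, from broadest to narrowest.
REGIONS = {
    "broad": (35355, 42486),
    "narrower": (37912, 42486),
    "narrowest": (37912, 39288),
}
```

With a full-length record held in a string `record`, a call such as
`subsequence(record, *REGIONS["narrowest"])` would return the 1377-nucleotide
segment spanning positions 37912-39288.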

B. Preparation of Plant Artificial Chromosomes 

Cell lines containing artificial chromosomes can be prepared by
transforming cells, preferably a stable cell line, with heterologous nucleic acid
and identifying cells that contain an artificial chromosome as described
herein. The artificial chromosome is a chromosomal structure that is distinct
from any chromosome that existed in the cell prior to introduction of the
heterologous nucleic acid. A cell containing an artificial chromosome may be
identified using a variety of procedures, alone or in combination, as described
in detail herein. In particular embodiments of the methods described herein,
the heterologous nucleic acid contains a sequence that targets the nucleic
acid to an amplifiable region of a chromosome in the cell, such as, for
example, the pericentric heterochromatin and/or rDNA. A variety of targeting
sequences are provided herein.

Prior to analyzing transformed cells for the presence of an artificial
chromosome, the cells to be analyzed may be enriched with artificial
chromosome-containing cells using a variety of techniques depending on the
heterologous nucleic acid that was introduced into the host cell to initiate
generation of the artificial chromosomes. For example, if nucleic acid
encoding a selectable marker was included in the heterologous nucleic acid,
cells containing the marker may be selected for analysis. If the selectable
marker is one that confers resistance to a cytotoxic agent, e.g., bialaphos,
hygromycin or kanamycin, the transformed cells may be cultured under
selective conditions which include the agent. Cells surviving growth under
selective conditions are then analyzed for the presence of artificial
chromosomes. If the selectable marker is a readily detectable reporter
molecule, such as, for example, a fluorescent protein, the transformed cells
may be selected on the basis of fluorescent properties. For example, cells
containing the fluorescent protein may be isolated from nontransformed cells
using a fluorescence-activated cell sorter (FACS).

In analyzing transformed cells for the presence of artificial
chromosomes, it is also possible to identify cells that have a multicentric,
typically dicentric, chromosome, formerly multicentric (typically dicentric)
chromosome, minichromosome and/or heterochromatic structures, such as a
megachromosome and a sausage chromosome. If cells containing
multicentric chromosomes or formerly multicentric (typically formerly
dicentric) chromosomes are initially selected, these cells can then be
manipulated, if need be, as described herein to produce the
minichromosomes and other artificial chromosomes, particularly the
heterochromatic artificial chromosomes and other segmented, repeat
region-containing artificial chromosomes, as described herein.

1. Cells used in the generation of plant artificial chromosomes

Any cells harboring plant centromere-containing chromosomes may be
used in the generation of plant artificial chromosomes (PACs). Such cells
include, but are not limited to, plant cells, protoplasts, and cells that are
hybrid cells of one or more plant species. Preferred cells are those that
harbor plant centromere-containing chromosomes and are readily susceptible
to the introduction of heterologous nucleic acids therein.

Cells for use in the generation of plant artificial chromosomes include
cells that harbor acrocentric plant chromosomes. Examples of acrocentric
plant chromosomes include chromosomes 2 and 4 of the plant Arabidopsis
thaliana (see, e.g., Mayer et al. (1999) Nature 402:769-777; Murata et al.
(1997) The Plant Journal 12:31-37; The Arabidopsis Genome Initiative
(2000) Nature 408:796-815), four acrocentric chromosome pairs in
Helianthus annuus (sunflower; see Schrader et al. (1997) Chromosome Res.
5:451-456), two pairs of acrocentric chromosomes in domesticated pepper
plant (Capsicum annuum) and a nearly acrocentric chromosome in lentil
plant. In particular embodiments of the methods described herein, cells
harboring acrocentric plant chromosomes containing rDNA are used in
generating plant artificial chromosomes.

Plant species from which cells may be obtained include, but are not
limited to, vegetable crops, fruit and vine crops, field plants, bedding plants,
trees, shrubs, and other nursery stock. Examples of vegetable crops include
artichokes, kohlrabi, arugula, leeks, asparagus, lettuce, bok choy, malanga,
broccoli, melons (e.g., muskmelon, watermelon, crenshaw, honeydew,
cantaloupe), Brussels sprouts, cabbage, cardoni, carrots, napa, cauliflower,
okra, onions, celery, parsley, chick peas, parsnips, chicory, Chinese cabbage,
peppers, collards, potatoes, cucumber plants, pumpkins, cucurbits, radishes,
dry bulb onions, rutabaga, eggplant, salsify, escarole, shallots, endive, garlic,
spinach, green onions, squash, greens, beet, sweet potatoes, Swiss chard,
horseradish, tomatoes, kale, turnips and spices. Fruit and vine crops include
apples, apricots, cherries, nectarines, peaches, pears, plums, prunes, quince,
almonds, chestnuts, filberts, pecans, pistachios, walnuts, citrus, blueberries,
boysenberries, cranberries, currants, loganberries, raspberries, strawberries,
blackberries, grapes, avocados, bananas, kiwi, persimmons, pomegranate,
pineapple, tropical fruits, pomes, melon, mango, papaya and lychee.

Field crop plants include evening primrose, meadow foam, corn,
maize, hops, jojoba, peanuts, rice, safflower, small grains (barley, oats, rye,
wheat, and others), sorghum, tobacco, kapok, leguminous plants (beans,
lentils, peas, soybeans), oil plants (canola, rape, mustard, poppy, olives,
sunflowers, coconut, castor oil plants, cocoa beans, groundnuts), fibre plants
(cotton, flax, hemp, jute), lauraceae (cinnamon, camphor) and plants such as
coffee, sugarcane, tea and natural rubber plants. Other examples of plants
include bedding plants such as flowers, cactus, succulents and ornamental
plants, as well as trees such as forest (broad-leaved trees and evergreens,
such as conifers), fruit, ornamental and nut-bearing trees, shrubs, algae,
moss, and duckweed.

2. Heterologous nucleic acids for use in generating plant artificial 
chromosomes 

a. Selectable markers 

The heterologous nucleic acid that is introduced into a cell in the
generation of artificial chromosomes as described herein may include nucleic
acid encoding a selectable marker. Any nucleic acid that includes a
selectable marker sequence may be introduced into cells harboring plant
centromere-containing chromosomes for the generation of plant artificial
chromosomes. Examples of selectable markers include, but are not limited
to, DNA encoding a product that confers resistance to a cytotoxic or
cytostatic agent and DNA encoding a readily detectable product, such as a
reporter protein.

(1) Nucleic acids encoding products that confer
resistance to a selection agent

Examples of selectable markers include the dihydrofolate reductase
(dhfr) gene, hygromycin phosphotransferase genes, the phosphinothricin
acetyl transferase gene (bar gene) and neomycin phosphotransferase genes.
Selectable markers that can be used in animal, e.g., mammalian, cells include,
but are not limited to, the thymidine kinase gene and the cellular
adenine-phosphoribosyltransferase gene.

Of particular interest for purposes herein are nucleic acid selectable
markers that, upon expression in the host cell, confer antibiotic or herbicide
resistance to the cell, sufficient to provide for the maintenance of
heterologous nucleic acids in the cell, and which facilitate the transfer of
artificial chromosomes containing the marker DNA into new host cells.
Examples of such markers include DNA encoding products that confer
cellular resistance to hygromycin, kanamycin, G418, bialaphos, Basta,
methotrexate, glyphosate, and puromycin. For example, neo (or nptII)
provides kanamycin resistance and can be selected for using kanamycin,
G418, paromomycin and other agents [see, e.g., Messing and Vierra (1982)
Gene 19:259-268; and Bevan et al. (1983) Nature 304:184-187]; bar from
Streptomyces hygroscopicus, which encodes the enzyme phosphinothricin
acetyl transferase (PAT), confers bialaphos, glufosinate, Basta or
phosphinothricin resistance [see, e.g., White et al. (1990) Nuc. Acids Res.
18:1062; Spencer et al. (1990) Theor. Appl. Genet. 79:625-631; Vickers et
al. (1996) Plant Mol. Biol. Reporter 14:363-368; and Thompson et al. (1987)
EMBO J. 6:2519-2523]; the hph gene confers resistance to the
antibiotic hygromycin (see, e.g., Blochinger and Diggelmann, Mol. Cell. Biol.
4:2929-2931); a mutant EPSP synthase protein [see Hinchee et al. (1988)
Bio/Technol. 6:915-922] confers glyphosate resistance (see also U.S. Patent
Nos. 4,940,935 and 5,188,642); and a nitrilase such as bxn from Klebsiella
ozaenae confers resistance to bromoxynil [see Stalker et al. (1988) Science
242:419-42]. DNA encoding cystathionine gamma-synthase (CGS) can be
used as a marker that confers resistance to ethionine (see PCT Application
Publication No. WO 00/55303). Examples of markers that can be used in
animal, e.g., mammalian, cells include, but are not limited to, DNA encoding
products that confer cellular resistance to streptomycin, zeocin,
chloramphenicol and tetracycline.

(2) Reporter Molecules

Nucleic acids encoding reporter molecules may also be included in the
nucleic acid that is introduced into a recipient cell in the generation of
artificial chromosomes. Reporter genes provide a means for identifying cells
and chromosomes into which heterologous nucleic acids have been
transferred and further provide a means for assessing whether or not, and to
what extent, transferred DNA is expressed.

Nucleic acids encoding reporter molecules that may be used in
monitoring transfer and expression of heterologous nucleic acids into cells,
particularly plant cells, include, but are not limited to, nucleic acid encoding
β-glucuronidase (GUS) or the uidA gene product, which is an enzyme for which
various chromogenic substrates are known [see Novel and Novel (1973) Mol.
Gen. Genet. 120:319-335; Jefferson et al. (1986) Proc. Natl. Acad. Sci.
USA 83:8447-8451; US Patent No. 5,268,463; commercially available from
Clontech Laboratories, Palo Alto, CA], DNA from an R-locus gene, which
encodes a product that regulates the production of anthocyanin pigments
(red color) in plant tissues [see, e.g., Dellaporta et al. (1988) In
"Chromosome Structure and Function; Impact of New Concepts, 18th
Stadler Genetics Symposium" 11:263-282], nucleic acid encoding β-lactamase
[Sutcliffe (1978) Proc. Natl. Acad. Sci. U.S.A. 75:3737-3741], which is an
enzyme for which various chromogenic substrates are known (e.g., PADAC,
a chromogenic cephalosporin), DNA from a xylE gene [see, e.g., Zukowsky
et al. (1983) Proc. Natl. Acad. Sci. U.S.A. 80:1101-1105], which encodes a
catechol dioxygenase that can convert chromogenic catechols, nucleic acid
encoding α-amylase [see, e.g., Ikuta et al. (1990) Bio/Technol. 8:241-242],
nucleic acid encoding tyrosinase [see, e.g., Katz et al. (1983) J. Gen.
Microbiol. 129:2703-2714], an enzyme capable of oxidizing tyrosine to
DOPA and dopaquinone which in turn condenses to form the readily
detectable compound melanin, nucleic acid encoding β-galactosidase, an
enzyme for which there are chromogenic substrates, nucleic acid encoding
luciferase (the lux gene) [see, e.g., Ow et al. (1986) Science 234:856-859],
which allows for bioluminescence detection, nucleic acid encoding aequorin
[see, e.g., Prasher et al. (1985) Biochem. Biophys. Res. Commun.
126:1259-1268], which may be employed in calcium-sensitive bioluminescence
detection, nucleic acid encoding a green fluorescent protein (GFP) [see, e.g.,
Sheen et al. (1995) Plant J. 8:777-784; Haseloff et al. (1997) Proc. Natl.
Acad. Sci. U.S.A. 94:2122-2127; Haseloff and Amos (1995) Trends Genet.
11:328-329; Reichel et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:5888-
5893; Tian et al. (1997) Plant Cell Rep. 16:267-271; Prasher et al. (1992)
Gene 111:229-233; Chalfie et al. (1994) Science 263:802; PCT Application
Publication Nos. WO97/41228 and WO 95/07463; and commercially
available from Clontech Laboratories, Palo Alto, CA], nucleic acid encoding a
red or blue fluorescent protein (RFP or BFP, respectively), or nucleic acid
encoding chloramphenicol acetyltransferase (CAT).

Enhanced GFP (EGFP) is a mutant of GFP with a 35-fold increase in
fluorescence. This variant has mutations of Ser to Thr at amino acid 65 and
Phe to Leu at position 64 and is encoded by a gene with optimized human
codons (see, e.g., U.S. Patent No. 6,054,312). EGFP is a red-shifted variant
of wild-type GFP (Yang et al. (1996) Nucl. Acids Res. 24:4592-4593; Haas
et al. (1996) Curr. Biol. 6:315-324; Jackson et al. (1990) Trends Biochem.
15:477-483) that has been optimized for brighter fluorescence and higher
expression in mammalian cells (excitation maximum = 488 nm; emission
maximum = 507 nm). EGFP encodes the GFPmut1 variant (Jackson (1990)
Trends Biochem. 15:477-483) which contains the double-amino-acid
substitution of Phe-64 to Leu and Ser-65 to Thr. Sequences flanking EGFP
have been converted to a Kozak consensus translation initiation site (Huang
et al. (1990) Nucleic Acids Res. 18:937-947) to further increase the
translation efficiency in eukaryotic cells.
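
The double substitution described above (Phe-64 to Leu and Ser-65 to Thr)
can be written compactly as F64L and S65T. The Python sketch below applies
substitutions in that notation to a protein sequence and refuses to proceed
if the expected original residue is not found. The function name is
hypothetical, and the test sequence is a placeholder built only so that
positions 64 and 65 hold Phe and Ser; it is not the GFP sequence.

```python
import re
from typing import Iterable

# One-letter original residue, 1-based position, one-letter new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein: str, substitutions: Iterable[str]) -> str:
    """Apply point substitutions such as 'F64L' to a one-letter protein string."""
    residues = list(protein)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)


# Placeholder sequence (NOT real GFP): Phe at position 64, Ser at position 65.
placeholder = "A" * 63 + "FSYG" + "A" * 10
variant = apply_substitutions(placeholder, ["F64L", "S65T"])
```

Checking the original residue before replacing it guards against off-by-one
numbering errors, which are easy to make when a reference sequence does or
does not count the initiator methionine.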

Nucleic acid from the maize R gene complex can also be used as
nucleic acid encoding a reporter molecule. The R gene complex in maize
encodes a protein that acts to regulate the production of anthocyanin
pigments in most seed and plant tissue. Maize strains can have one, or as
many as four, R alleles which combine to regulate pigmentation in a
developmental and tissue-specific manner. Thus, an R gene introduced into
such cells will cause the expression of a red pigment and, if stably
incorporated, can be visually scored as a red sector. If a maize line carries
dominant alleles for genes encoding the enzymatic intermediates in the
anthocyanin biosynthetic pathway (C2, A1, A2, Bz1 and Bz2), but carries a
recessive allele at the R locus, the transformation of any cell from that line
with R will result in red pigment formation. Exemplary lines include
Wisconsin 22, which contains the rg-Stadler allele, and TR112, a K55
derivative which is r-g, b, Pl. Alternatively, any genotype of maize can be
utilized if the C1 and R alleles are introduced together.

b. Promoters and other sequences that influence gene 
expression 

Expression of nucleic acid encoding a selectable marker (or any
heterologous nucleic acid) in a recipient cell can be regulated by a variety of
promoters. Promoters for use in regulating transcription of DNA in cells,
particularly plant cells, include, but are not limited to, the nopaline synthase
(NOS) and octopine synthase (OCS) promoters; cauliflower mosaic virus
(CaMV) 19S and 35S promoters, the light-inducible promoter from the small
subunit of ribulose bis-phosphate carboxylase (ssRUBISCO, an abundant
plant polypeptide), the mannopine synthase (MAS) promoter [see, e.g.,
Velten et al. (1984) EMBO J. 3:2723-2730; and Velten and Schell (1985)
Nuc. Acids Res. 13:6981-6998], the rice actin promoter, the ubiquitin
promoter, for example, from Z. mays (see, e.g., PCT Application Publication
No. WO00/60061), the Arabidopsis thaliana UBI 3 promoter [see, e.g., Norris
et al. (1993) Plant Mol. Biol. 22:895-906] and the chemically inducible PR-1
promoter from tobacco or Arabidopsis (see, e.g., U.S. Patent No. 5,689,044).

Selection of a suitable promoter may include several considerations,
for example, recipient cell type (such as, for example, leaf epidermal cells,
mesophyll cells, root cortex cells), tissue- or organ-specific (e.g., roots,
leaves or flowers) expression of genes linked to the promoter, and timing and
level of expression (as may be influenced by constitutive vs. regulatable
promoters and promoter strength).

Additional sequences that may also be included in the nucleic acid
containing a selectable marker include, but are not restricted to, transcription
terminators and extraneous sequences to enhance expression such as
introns. A variety of transcription terminators may be used which are
responsible for termination of transcription beyond a coding region and
correct polyadenylation. Appropriate transcription terminators include those
that are known to function in plants such as, for example, the CaMV 35S
terminator, the tml terminator, the nopaline synthase terminator and the pea
rbcS E9 terminator, all of which may be used in both monocotyledonous and
dicotyledonous plants.

Numerous sequences have been found to enhance gene expression
from within the transcriptional unit and these sequences can be used in
conjunction with selectable marker and other genes to increase expression of
the genes in plant cells. For example, various intron sequences such as
introns of the maize Adh1 gene have been shown to enhance expression,
particularly in monocotyledonous cells. In addition, a number of
non-translated leader sequences derived from viruses are also known to enhance
expression, and these are particularly effective in dicotyledonous cells.

c. Nucleic acids containing targeting sequences 

Development of a multicentric, particularly dicentric, chromosome
typically is effected through integration of heterologous nucleic acid into
heterochromatin, such as the pericentric heterochromatin, near or within the
centromeric regions of chromosomes and/or into rDNA sequences. Thus, the
development of artificial chromosomes may be facilitated by targeting the
heterologous nucleic acid for integration into these regions, such as by
introducing DNA, including, but not limited to, rDNA (e.g., rDNA intergenic
spacer sequence), satellite DNA, pericentric DNA and lambda phage DNA,
into the recipient host cell. The targeting sequence may be introduced alone
or with other nucleic acids, including but not limited to selectable markers.
For example, a targeting sequence can be linked to a selectable marker.

Examples of plant pericentric DNA and satellite DNA include, but are
not limited to, pericentromeric sequences on tomato chromosome 6 [see,
e.g., Weide et al. (1998) Mol. Gen. Genet. 259:190-197], satellite DNA of
soybean [see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 29:857-862], pericentromeric
DNA of Arabidopsis thaliana [see, e.g., Tutois et al. (1999) Chromosome
Res. 7:143-156], satellite DNA of Arabidopsis thaliana (GenBank accession
nos. AB033593 and X58104), pericentric DNA of the chickpea [Cicer
arietinum L.; see, e.g., Staginnus et al. (1999) Plant Mol. Biol. 39:1037-
1050], satellite DNA on the rye B chromosome [see, e.g., Langdon et al.
(2000) Genetics 154:869-884], subtelomeric satellite DNA from Silene
latifolia [see, e.g., Garrido-Ramos et al. (1999) Genome 42:442-446] and
satellite DNA in the Saccharum complex [see, e.g., Alix et al. (1998)
Genome 41:854-864].

Examples of rDNA targeting sequences include nucleic acids from
plant and animal rDNA. Plant rDNA sequences include, but are not limited
to, sequences contained in GENBANK Accession numbers D16103 [from
rDNA of carrot (Daucus carota)], M23642 and M11585 [from rDNA encoding
24S rRNA of rice (Oryza sativa)], M26461 [from rDNA encoding 18S
rRNA of rice (Oryza sativa)], M16845 [from rDNA encoding 17S, 5.8S and
25S rRNA of rice (Oryza sativa)], X82780 and X82781 [from rDNA encoding
5S rRNA of potato (Solanum tuberosum)], AJ131161, AJ131162,
AJ131163, AJ131164, AJ131165, AJ131166 and AJ131167 [from rDNA
encoding 5S rRNA of tobacco (Nicotiana tabacum)], L36494 and U31016
through U31030 [from rDNA encoding 5S rRNA of barley (Hordeum
spontaneum)], U31004 through U31015 and U31031 [from rDNA encoding
5S rRNA of barley (Hordeum bulbosum)], Z11759 [from rDNA encoding 5.8S
rRNA of barley (Hordeum vulgare)], X16077 (from rDNA encoding 18S rRNA
of Arabidopsis thaliana), M65137 (rDNA encoding 5S rRNA of Arabidopsis
thaliana), AJ232900 (from rDNA encoding 5.8S rRNA of Arabidopsis
thaliana) and X52320 (from Arabidopsis thaliana genes for 5.8S and 25S
rRNA with an 18S rRNA fragment).

Intergenic spacer regions of plant rDNA include, but are not limited to,
sequences contained in GENBANK Accession numbers S70723 [from the 5S
rDNA of barley (Hordeum vulgare)], AF013103 and X03989 [from maize
(Zea mays)], X65489 [from potato (Solanum tuberosum)], X52265 [from
tomato (Lycopersicon esculentum)], AF177418 (from Arabidopsis neglecta),
AF177421 and AF17422 (from Arabidopsis halleri), A71562, X15550,
X52631, U43224, X52320, X52636 and X52637 [from Arabidopsis
thaliana; see Gruendler et al. (1991) J. Mol. Biol. 221:1209-1222 and
Gruendler et al. (1989) Nucleic Acids Res. 17:6395-6396], X54194 [from
rice (Oryza sativa)], Y08422 and D76443 [from tobacco (Nicotiana
tabacum)], AJ243073 [from wheat (Triticum boeoticum)] and X07841 [from
wheat (Triticum aestivum)]. Sequences of intergenic spacer regions of plant
rDNA further include sequences from rye [see Appels et al. (1986) Can. J.
Genet. Cytol. 28:673-685], wheat [see Barker et al. (1988) J. Mol. Biol.
201:1-17 and Sardana and Flavell (1996) Genome 39:288-292], radish [see
Delcasso-Tremousaygue et al. (1988) Eur. J. Biochem. 172:767-776], Vicia
faba and Pisum sativum [see Kato et al. (1990) Plant Mol. Biol. 14:983-993],
mung bean [see Gerstner et al. (1988) Genome 30:723-733; and Schiebel et
al. (1989) Mol. Gen. Genet. 218:302-307], tomato [see Schmidt-Puchta et
al. (1989) Plant Mol. Biol. 13:251-253], Hordeum bulbosum [see Procunier et
al. (1990) Plant Mol. Biol. 15:661-663], Lens culinaris Medik., and other
legume species [see Fernandez et al. (2000) Genome 43:597-603] and
tobacco [see U.S. Patent Nos. 6,100,092 and 6,096,546 and PCT
Application Publication No. WO99/66058; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; Borysyuk et al. (2000) Nature Biotechnology 18:1303-
1306].

Mammalian rDNA sequences include, but are not limited to, rDNA of
GENBANK accession no. X82564 and portions thereof, the DNA of
GENBANK accession no. U13369 and portions thereof and DNA sequences
provided in PCT Application Publication No. WO97/40183 (particularly SEQ.
ID. NOS. 18-24 of WO97/40183). A particular vector for use in directing
integration of heterologous nucleic acid into chromosomal rDNA is pTERPUD
(see PCT Application Publication No. WO97/40183). Satellite DNA
sequences can also be used to direct the heterologous DNA to integrate into
the pericentric heterochromatin. For example, vectors pTEMPUD and
pHASPUD, which contain mouse and human satellite DNA, respectively (see
PCT Application Publication No. WO97/40183), may be used for
introduction of heterologous nucleic acid into cells for de novo
chromosome formation leading to artificial chromosomes.

3. Methods for introduction of heterologous nucleic acids into host 
cells 

Any methods known in the art for introducing heterologous nucleic
acids into host cells may be used in the methods of preparing artificial
chromosomes. The particular method used may depend on the type of cell
into which the heterologous nucleic acid is being transferred. For example,
methods for the physical introduction of nucleic acids into plant cells, such
as protoplasts and plant cells in culture, include, but are not limited to,
polyethylene glycol (PEG)-mediated DNA uptake, electroporation,
lipid-mediated delivery, including liposomes, calcium phosphate-mediated
DNA uptake, microinjection, particle bombardment, silicon carbide
whisker-mediated transformation and combinations of these methods, for
example methods utilizing combinations of calcium phosphate and PEG for
DNA uptake or methods utilizing a combination of electroporation, PEG and
heat shock (see, e.g., U.S. Patent Nos. 5,231,019 and 5,453,367). Physical
methods such as these are known in the art and are effective in introducing
DNA into a variety of dicotyledonous and monocotyledonous plants [see,
e.g., Paszkowski et al. (1984) EMBO J. 3:2717-2722; Potrykus et al. (1985)
Mol. Gen. Genet. 199:169-177; Reich et al. (1986) Biotechnology
4:1001-1004; Klein et al. (1987) Nature 327:70-73; U.S. Patent No. 6,143,949;
Paszkowski et al. (1989) in Cell Culture and Somatic Cell Genetics of Plants,
Vol. 6, Molecular Biology of Plant Nuclear Genes, eds. Schell, J. and Vasil,
I.K., Academic Publishers, San Diego, California, p. 52-68; and Frame et al.
(1994) Plant J. 5:941-948].

In addition to these methods for the introduction of nucleic acids into
plant cells based on physically, mechanically or chemically mediated
processes, it is possible to introduce nucleic acids into plant cells by
biological methods, such as those utilizing Agrobacterium. In this method,
nucleic acid sequences located adjacent to T-DNA border repeats can be
inserted into the genome of a plant cell, typically dicotyledonous plant cells,
by utilizing the encoded function for DNA transfer found in the genus
Agrobacterium. This method has also been shown to work for some
monocotyledonous plant cells, such as rice cells.

Any method for introducing nucleic acids into plant cells can be used
in the generation of artificial chromosomes, provided the method is capable
of introducing the nucleic acid into an amplifiable region of a chromosome,
for example, heterochromatin, and particularly in close proximity to a
megareplicator region of a plant chromosome.

a. Agrobacterium-mediated introduction of nucleic acids
into plant cells

Agrobacterium-mediated transformation is particularly well-suited for
transformation of dicotyledons because of its high efficiency of
transformation and its broad utility with many different species, including
tobacco, tomato (see, e.g., European Patent Application no. 0 249 432),
sunflower, cotton (see, e.g., European Patent Application no. 0 317 511),
oilseed rape, potato, soybean, alfalfa and poplar (see, e.g., U.S. Patent No.
4,795,855) (see also PCT Application Publication no. WO87/07299 with
respect to transformation of Brassica). Agrobacterium-mediated
transformation has also been used to transfer nucleic acids into
monocotyledonous plants. Agrobacterium-mediated transformation of
Chlorophytum capense and Narcissus cv "Paperwhite" [see, e.g.,
Hooykaas-Van Slogteren et al. (1984) Nature 311:763-764], corn and wheat
[see, e.g., U.S. Patent Nos. 5,164,310, 5,187,073 and 5,177,010 and Mooney
et al. (1991) Plant Cell, Tissue, Organ Culture 25:209-218], rice [see, e.g.,
Raineri et al. (1990) Bio/Technology 8:33-38 and Chan et al. (1993) Plant
Mol. Biol. 22:491-506] and barley [see, e.g., Tingay et al. (1997) The Plant J.
11:1369-1376 and Qureshi et al. (1998) Proc. 42nd Conference of
Australian Society for Biochemistry and Molecular Biology,
September 28-October 1, 1998, Adelaide Australia] has been reported.

Agrobacterium-mediated delivery of nucleic acids is based on the
capacity of certain Agrobacterium strains to introduce a part of their Ti
(tumor-inducing) plasmid, i.e., the transforming DNA or T-DNA, into plant
cells and to integrate this T-DNA into the genome of the cells. The part of
the Ti plasmid that is transferred and integrated is delineated by specific DNA
sequences, the left and right T-DNA border sequences. The natural T-DNA
sequences between these border sequences can be replaced by foreign DNA
[see, e.g., European Patent Publication 116 718 and Deblaere et al. (1987)
Meth. Enzymol. 153:277-293].

When Agrobacterium is used for transformation, the heterologous
nucleic acid being transferred typically is cloned into a plasmid that contains
T-DNA border regions and is replicated independently of the Ti plasmid
(referred to as the binary vector system) or the heterologous nucleic acid is
inserted between the T-DNA borders of the Ti plasmid (referred to as the
co-integrate method). In co-integrate methods, the vectors carrying the
heterologous nucleic acid (intermediate vectors) are integrated
into the Ti or Ri plasmid by homologous recombination owing to sequences
that are homologous to sequences within the T-DNA region of the Ti or Ri
plasmid. The Ti or Ri plasmid also contains the vir region necessary for
transfer of the T-DNA.

Intermediate vectors cannot replicate in Agrobacterium. The
intermediate vector can be transferred into Agrobacterium by means of a
helper plasmid [conjugation; see Fraley et al. (1983) Proc. Natl. Acad. Sci.
USA 80:4803]. This method, typically referred to as triparental mating,
introduces the heterologous nucleic acid sequence into the bacterium and
allows for selection of a homologous recombination event that produces the
desired Agrobacterium genotype. The triparental mating procedure typically
employs Escherichia coli carrying the recombinant intermediate vector and a
helper E. coli strain which carries a plasmid that is able to mobilize the
recombinant intermediate vector to the target Agrobacterium strain. A
modified Ti or Ri plasmid is obtained from the transfer and selection process,
which contains a heterologous nucleic acid sequence located within the
T-DNA region. The resultant Agrobacterium strain is capable of transferring
the heterologous nucleic acid to plant cells.

Binary vectors can replicate both in E. coli and Agrobacterium. They
typically contain a selection marker gene and a linker or polylinker which are
flanked by the right and left T-DNA border regions and can be transformed
directly into Agrobacterium [see, e.g., Hofgen and Willmitzer (1988) Nuc.
Acids. Res. 16:9877 and Holsters et al. (1978) Mol. Gen. Genet.
163:181-187] or introduced through triparental mating. The Agrobacterium
host cell contains a plasmid carrying a vir region needed for transfer of the
T-DNA into a plant cell [see, e.g., White in Plant Biotechnology, eds. Kung,
S. and Arntzen, C.J., Butterworth Publishers, Boston, Mass., (1989) p. 3-34 and
Fraley in Plant Biotechnology, eds. Kung, S. and Arntzen, C.J., Butterworth
Publishers, Boston, Mass., (1989) p. 395-407].

Agrobacterium-mediated transformation typically involves the transfer
of a binary vector carrying the heterologous nucleic acid of interest to an
appropriate Agrobacterium strain, which may depend on the complement of
vir genes carried by the host Agrobacterium strain either on a co-resident Ti
plasmid or chromosomally (see, e.g., Uknes et al. (1993) Plant Cell
5:159-169). The transfer of a recombinant binary vector to Agrobacterium is
accomplished by a triparental mating procedure using Escherichia coli
carrying the recombinant binary vector and a helper E. coli strain which
carries a plasmid that is able to mobilize the recombinant binary vector to
the target Agrobacterium strain. Alternatively, the recombinant binary vector
can be transferred to Agrobacterium by DNA transformation (see, e.g.,
Hofgen & Willmitzer (1988) Nuc. Acids. Res. 16:9877).

Many vectors are available for transfer of nucleic acids into
Agrobacterium tumefaciens [see, e.g., Rogers et al. (1987) Methods in
Enzymol. 153:253-277]. These typically carry at least one T-DNA border
sequence and include vectors such as pBIN19 [see, e.g., Bevan (1984) Nuc.
Acids. Res. 12:8711-8721]. Typical vectors suitable for Agrobacterium
transformation include the binary vectors pCIB200 and pCIB2001, as well as
the binary vector pCIB10 and hygromycin selection derivatives thereof (see,
e.g., U.S. Patent No. 5,639,949). Other vectors that can be employed are
the pCambia vectors (see www.cambia.org), including, for example,
pCambia 3300 and pCambia 1302 (GenBank Accession No. AF234298).

A particularly useful Ti plasmid cassette vector for the transformation
of dicotyledonous plants contains the enhanced CaMV35S promoter (EN35S)
and the 3' end, including polyadenylation signals, of a soybean gene
encoding the α subunit of β-conglycinin. Between these two elements is a
multilinker containing multiple restriction sites for the insertion of genes of
interest (see, e.g., U.S. Patent No. 6,023,013). The vector can contain a
segment of pBR322 which provides an origin of replication in E. coli and a
region for homologous recombination with the disarmed T-DNA in
Agrobacterium strain ACO; the oriV region from the broad host range
plasmid RK1; the streptomycin/spectinomycin resistance gene from Tn7; and
a chimeric NPTII gene, containing the CaMV35S promoter and the nopaline
synthase (NOS) 3' end, which provides kanamycin resistance in transformed
plant cells. Optionally, the enhanced CaMV35S promoter may be replaced
with the 1.5 kb mannopine synthase (MAS) promoter (see, e.g., Velten et al.
(1984) EMBO J. 3:2723-2730). After incorporation of a DNA construct into
the vector, it is introduced into A. tumefaciens strain ACO which contains a
disarmed Ti plasmid. Cointegrate Ti plasmid vectors are selected and
subsequently may be used to transform a dicotyledonous plant.

Transformation of the target plant species by recombinant
Agrobacterium usually involves co-cultivation of the Agrobacterium with
explants from the plant and follows published protocols. Methods of
inoculation of the plant tissue vary depending upon the plant species and the
Agrobacterium delivery system. The plant tissue can be either protoplast,
callus or organ tissue, depending on the plant species. A widely used
approach is the leaf disc procedure which can be performed with any tissue
explant that provides a good source for initiation of whole plant
differentiation (see, e.g., Horsch et al. in Plant Molecular Biology Manual A5,
Kluwer Academic Publishers, Dordrecht (1988) p. 1-9 and U.S. Patent No.
6,136,320). The addition of nurse tissue may be desirable under certain
conditions. There are multiple choices of Agrobacterium strains (including,
but not limited to, A. tumefaciens and A. rhizogenes) and plasmid
construction strategies that can be used to optimize genetic transformation
of plants. Transformed tissue carrying an antibiotic or herbicide resistance
marker present between the T-DNA borders of the binary plasmid can be
regenerated on selectable medium.

A. tumefaciens ACO is a disarmed strain similar to pTiB6SE [see
Fraley et al. (1985) Bio/Technology 3:629-635]. For construction of ACO,
the starting Agrobacterium strain was A208 which contains a nopaline-type
Ti plasmid. The Ti plasmid was disarmed in a manner similar to that
described by Fraley et al. [(1985) Bio/Technology 3:629-635] so that
essentially all of the native T-DNA was removed except for the left border
and a few hundred base pairs of T-DNA inside the left border. The remainder
of the T-DNA extending to a point just beyond the right border was replaced
with a piece of DNA including (from left to right) a segment of pBR322, the
oriV region from plasmid RK2, and the kanamycin resistance gene from
Tn601. The pBR322 and oriV segments are similar to the corresponding
segments of the cassette vector described above and provide a region of
homology for cointegrate formation (see U.S. Patent No.
6,023,013). Another useful strain of Agrobacterium is A. tumefaciens strain
GV3101/pMP90 [see, e.g., Koncz and Schell (1986) Mol. Gen. Genet.
204:383-396].

Advances in Agrobacterium-mediated transfer allow introduction of
larger segments of nucleic acids [see, e.g., Hamilton (1997) Gene
200(1-2):107-116; Hamilton et al. (1996) Proc. Natl. Acad. Sci. U.S.A.
93:9975-9979; Liu et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96:6535-6540].
The vectors used in these methods are designed to have the characteristics of
both bacterial artificial chromosomes (BACs) and binary vectors for
Agrobacterium-mediated transformation. Therefore, somewhat larger DNA
fragments cloned in the T-DNA region can be transferred into a plant genome
by Agrobacterium. Binary bacterial artificial chromosome (BIBAC) vector
BIBAC2 (see U.S. Patent No. 5,733,744; available from the Plant Science
Center, Cornell University) and the transformation-competent bacterial
artificial chromosome (TAC) vector pYLTAC7 (available from the Plant Cell
Bank of the RIKEN Gene Bank, Tsukuba, Japan) are examples of the types of
vectors that may be used in transferring larger segments of nucleic acids,
particularly heterologous nucleic acids containing targeting and/or selectable
marker sequences as described herein, into plants via
Agrobacterium-mediated DNA transfer processes.

Introduction of heterologous nucleic acids into plant cells without the
use of Agrobacterium circumvents the requirements for T-DNA sequences in
the transformation vector and consequently vectors lacking these sequences
can be utilized in addition to vectors containing T-DNA sequences.
Techniques for nucleic acid transfer that do not rely on Agrobacterium
include transformation via particle bombardment, direct DNA uptake (e.g.,
PEG, lipids, electroporation) and mechanical methods such as microinjection
or silicon "whiskers". The choice of vector that may be used in introduction
of heterologous nucleic acids into plant cells can depend largely on the
preferred selection for the species being transformed. Typical vectors
suitable for transformation without Agrobacterium include pCIB3064,
pSOG19 and pSOG35 (see, e.g., U.S. Patent No. 5,639,949), or common
plasmid, phage or cosmid vectors.

b. Direct DNA Uptake 
Introduction of heterologous nucleic acids into plant cells may be
achieved using a variety of methods that facilitate direct DNA uptake,
including calcium phosphate precipitation, polyethylene glycol (PEG)
treatment, electroporation, and combinations thereof [see, e.g., Potrykus et
al. (1985) Mol. Gen. Genet. 199:133; Lorz et al. (1985) Mol. Gen. Genet.
199:113; Fromm et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828;
Uchimiya et al. (1986) Mol. Gen. Genet. 204:204; Callis et al. (1987) Genes
Dev. 1:1183-1200; Callis et al. (1987) Nuc. Acids Res. 15:5823-5831;
Marcotte et al. (1988) Nature 335:454; Toriyama et al. (1988)
Bio/Technology 6:1072-1074; Haim et al. (1985) Mol. Gen. Genet.
199:131-168; Deshayes et al. (1985) EMBO J. 4:2731-2737; Krens et al.
(1982) Nature 296:72-74; Crossway et al. (1986) Mol. Gen. Genet. 20:179].

Typically, plant protoplasts are used for direct DNA uptake, or in some
instances plant tissue that has been treated to remove a portion or the
majority of the cell wall (see, e.g., PCT Publication No. WO93/21335 and
U.S. Patent No. 5,472,869). Removal of the cell wall is believed to facilitate
entry of DNA into plant cells, although in some instances electroporation may
be used to introduce DNA into specialized plant cells, e.g., electroporation of
pollen, without first removing the cell wall.

Techniques for the preparation of callus and protoplasts from maize,
transformation of protoplasts using PEG or electroporation, and the
regeneration of maize plants from transformed protoplasts are found, for
example, in European Patent Application nos. 0 292 435 and 0 392 225 and
PCT Application Publication no. WO93/07278. Transformation of rice can
also be undertaken by direct gene transfer techniques utilizing protoplasts
[see, e.g., Zhang et al. (1988) Plant Cell Rep. 7:379-384; Shimamoto et al.
(1989) Nature 338:274-277; Datta et al. (1990) Biotechnology 8:736-740].
The regeneration of fertile transgenic barley by direct DNA transfer to
protoplasts is described, for example, by Funatsuki et al. [(1995) Theor.
Appl. Genet. 91:707-712]. Other plant species, including tobacco and
Arabidopsis, may also serve as sources of protoplasts for use in introduction
of heterologous nucleic acids into plant cells.

c. Particle bombardment-mediated introduction of nucleic
acids into plant cells

Microprojectile bombardment of plant cells can be an effective method
for the introduction of nucleic acids into plant cells. In these methods,
nucleic acids are carried through the cell wall and into the cytoplasm on the
surface of small, typically metal, particles [see, e.g., Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A.
85:8502-8505; Klein et al. in Progress in Plant Cellular and Molecular
Biology, eds. Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J.,
Kluwer Academic Publishers, Dordrecht, (1988), p. 56-66; Seki et al. (1999)
Mol. Biotechnol. 11:251-255; and McCabe et al. (1988) Bio/Technology
6:923-926]. Particles may be coated with nucleic acids and delivered into
cells by a propelling force. Exemplary particles include those containing
tungsten, gold or platinum, as well as magnesium sulfate crystals. The metal
particles can penetrate through several layers of cells and thus allow the
transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering nucleic acids into plant cells, e.g., maize cells, by
acceleration, a Biolistics Particle Delivery System may be used to propel
particles coated with DNA or cells through a screen, such as a stainless steel
or Nytex screen, onto a filter surface covered with plant (e.g., corn) cells
cultured in suspension. The screen disperses the particles so that they are
not delivered to the recipient cells in large aggregates. The intervening
screen between the projectile apparatus and the cells to be bombarded may
reduce the size of projectile aggregates and may contribute to a higher
frequency of transformation by reducing damage inflicted on the recipient
cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
macroprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
can be important in this technology. Physical factors include those that
involve manipulating the DNA/microprojectile precipitate or those that affect
the flight and velocity of either the macro- or microprojectiles. Biological
factors include all steps involved in manipulation of cells before and
immediately after bombardment, the osmotic adjustment of target cells to
help alleviate the trauma associated with bombardment, and also the nature
of the transforming nucleic acid, such as linearized DNA or intact supercoiled
plasmids.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells.
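
The optimization of physical parameters described above can be organized as a grid of settings, each scored by the number of stable transformants recovered. The sketch below only enumerates the combinations and selects the best-scoring one. The function names, the numeric values, the units and the transformant counts are hypothetical placeholders chosen for illustration and are not taken from this disclosure.

```python
from itertools import product


def parameter_grid(**settings):
    """Return every combination of the supplied parameter values as a dict."""
    names = list(settings)
    value_lists = [settings[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


def best_setting(results):
    """Pick the setting with the most stable transformants.

    results is a list of (setting, stable_transformant_count) pairs.
    """
    if not results:
        raise ValueError("no results to compare")
    return max(results, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    # Placeholder values only; real ranges must be chosen empirically.
    grid = parameter_grid(
        gap_distance=[1, 2],
        flight_distance=[5],
        tissue_distance=[6, 9],
        helium_pressure=[900, 1100],
    )
    # Made-up counts standing in for observed stable transformants.
    counts = [3, 7, 2, 9, 4, 1, 6, 5]
    print(best_setting(list(zip(grid, counts))))
```

Biological parameters such as osmotic state or subculture stage can be added as further keyword arguments in the same way.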

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. [(1990) Plant Cell
2:603-618] and Fromm et al. [(1990) Biotechnology 8:833-839].
Transformation of rice may also be accomplished via particle bombardment
[see, e.g., Christou et al. (1991) Biotechnology 9:957-962]. Particle
bombardment may also be used to transform wheat [see, e.g., Vasil et al.
(1992) Biotechnology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol.
102:1077-1084 for transformation of wheat using particle bombardment of
immature embryos and immature embryo-derived callus]. The production of
transgenic barley using bombardment methods is described, for example, by
Koprek et al. [(1996) Plant Sci. 119:79-91].

d. Electroporation-mediated introduction of nucleic acids
into plant cells

The application of brief, high-voltage electric pulses to a variety of
animal and plant cells leads to the formation of nanometer-sized pores in the
plasma membrane. Nucleic acids are taken directly into the cell cytoplasm
either through these pores or as a consequence of the redistribution of
membrane components that accompanies closure of the pores.
Electroporation can be extremely efficient and can be used both for transient
expression of cloned genes and for the establishment of cell lines that carry
integrated copies of the gene of interest.

Certain cell wall-degrading enzymes, such as pectin-degrading
enzymes, may be employed to render the target recipient cells more
susceptible to transformation by electroporation than untreated cells.
Alternatively, recipient cells may be made more susceptible to
transformation by mechanical wounding. To effect transformation by
electroporation, friable tissues such as a suspension culture of cells or
embryonic callus may be used or immature embryos or other organized
tissues may be directly transformed [see, e.g., Fromm et al. (1986) Nature
319:791-793; and Neumann et al. (1982) EMBO J. 1:841-845].

e. Microinjection-mediated introduction of nucleic acids into
plant cells

In microinjection techniques, nucleic acids are mechanically injected
directly into cells using very small micropipettes. For example, microinjection
of protoplast cells with foreign DNA for transformation of plant cells has
been reported for barley and tobacco [see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 1:23-30].

f. Lipid-mediated introduction of nucleic acids into plant
cells

In lipid-mediated transfer, nucleic acids are contacted with lipids
and/or encapsulated in lipid-containing structures, including but not limited to
liposomes, and the nucleic acid-containing liposomes are fused with plant
protoplasts. The fusion can occur in the presence or absence of a fusogen,
such as PEG. Lipid-mediated transformation of plant protoplasts has been
reported [see e.g., Fraley and Papahadjopoulos (1982) Curr. Top. Microbiol.
Immunol. 96:171-191; Deshayes et al. (1985) EMBO J. 4:2731-2737 and
Spoerlein and Koop (1991) Theor. Appl. Genetics 83:1-5].

g. Other methods of introduction of nucleic acids into plant 
cells 

Other methods to physically introduce nucleic acid into plant cells may
be used, including silicon carbide fibers ("whiskers") that are used to pierce
plant cell walls thereby facilitating nucleic acid uptake, the use of sound
waves to introduce holes in plant cell membranes to facilitate nucleic acid
uptake (e.g., sonoporation) and the use of laser beams to open holes in cell
membranes facilitating the entry of nucleic acids (e.g., laser poration).

Nucleic acids may also be imbibed by hydrating plant tissue, providing
another method for nucleic acid uptake into plant cells [see, e.g., Simon
(1974) New Phytologist 57:377-420]. For example, nucleic acids may be
taken into cereal and legume seed embryos by imbibition [see, e.g., Toepfer
et al. (1989) The Plant Cell 1:133-139].

4. Treatment of cells into which heterologous nucleic acids have 
been introduced 

Cells into which heterologous nucleic acids have been introduced may
be analyzed for de novo formation of artificial chromosomes described herein,
such as may result from amplification of chromosomal segments occurring in
connection with integration of heterologous nucleic acids into chromosomes.
Typically, amplification occurs over multiple generations of cell division
leading to the formation of detectable changes in chromosome structure.
Therefore, transfected cells are typically cultured through multiple cell
divisions, from about 5 to about 60, or about 5 to about 55, or about 10 to
about 55, or about 25 to about 55, or about 35 to about 55 cell divisions
following introduction of nucleic acid into a cell. Artificial chromosomes
may, however, appear after only about 5 to about 15 or about 10 to about
15 cell divisions. Cells into which heterologous nucleic acids have been
introduced may be treated in a variety of ways prior to or during analysis
thereof for the presence of artificial chromosomes.

For example, cells into which nucleic acid encoding a selectable
marker required for growth in the presence of a selection agent has been
transferred can be treated as the exemplified cells herein to facilitate
generation of multicentric chromosomes, and fragmentation thereof, and/or
the generation of artificial chromosomes. The cells may be grown in the
presence of an appropriate concentration of selection agent, which may be
determined empirically by growing untransfected cells in varying
concentrations of the agent and identifying concentrations sufficient to
prevent cell growth and/or facilitate amplification of chromosomal segments.
Transfected cells may be grown in selective media for numerous generations
and cell lines can be established that contain the introduced nucleic acid.
The concentration of selection agent may also be increased over several
generations to promote amplification of a region of a chromosome into which
heterologous nucleic acid integrated. Transfected cells may also be treated
to destabilize the chromosomes to facilitate generation and fragmentation of
a multicentric, typically dicentric, chromosome.
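
The empirical determination of a selection agent concentration described above amounts to a kill curve: untransfected cells are grown at a series of concentrations and the lowest concentration that prevents growth is taken as a starting point. The following is a minimal sketch under stated assumptions: growth is recorded as a fold change in cell number, a fold change at or below a hypothetical cutoff of 1.0 is treated as no growth, and the concentrations and readings shown are made-up illustration values.

```python
def minimal_inhibitory_concentration(growth_by_conc, no_growth_cutoff=1.0):
    """Return the lowest tested concentration that prevents growth.

    growth_by_conc maps concentration -> fold change in cell number for
    untransfected cells. A concentration qualifies only if it and every
    higher tested concentration show a fold change at or below the cutoff.
    Returns None if even the highest concentration permits growth.
    """
    result = None
    # Walk from the highest concentration down and stop at the first
    # concentration where the cells still grow.
    for conc in sorted(growth_by_conc, reverse=True):
        if growth_by_conc[conc] <= no_growth_cutoff:
            result = conc
        else:
            break
    return result


if __name__ == "__main__":
    # Hypothetical readings (arbitrary concentration units -> fold change).
    readings = {0: 8.0, 10: 4.2, 25: 1.6, 50: 0.9, 100: 0.7}
    print(minimal_inhibitory_concentration(readings))
```

Requiring that all higher concentrations also prevent growth guards against picking a single anomalous low reading.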
Additional heterologous nucleic acid, e.g., nucleic acid encoding a
selectable marker, may also be introduced into the transfected cells to
facilitate amplification of chromosomal segments, such as the pericentric
heterochromatin, contained in, for example, a fragment released from a
multicentric chromosome (e.g., a formerly dicentric chromosome), and
generation of a heterochromatic artificial chromosome. The resulting
transformed cells can then be grown in the presence of a selection agent,
which may be a second agent (if the heterologous nucleic acid introduced
into the transfected cells encodes a selectable marker different from any
selectable marker encoded by heterologous nucleic acid initially transferred
into the original host cells), with or without the first selection agent.

Cells into which nucleic acids have been introduced may also be
subjected to cell sorting. For example, protoplasts may be prepared from
transfected plant cells or calli and subjected to sorting. If the sorting is
conducted prior to chromosomal analysis of the cells for the presence of
artificial chromosomes, it provides a population of transfected cells that may
be enriched for artificial chromosomes and thus facilitates the subsequent
chromosomal analysis of the cells.

The sorting is based on the presence of a detectable marker in the 
cells, as provided for by the introduced nucleic acid, which can provide the 

25 basis for isolating such cells from cells that do not contain the heterologous 
nucleic acid. For example, the nucleic acid introduced into the plant cells 
may contain nucleic acid encoding a fluorescent protein, such as a green, red 
or blue fluorescent protein, which may be used for selection, by flow 
cytometry and other methods, of recipient cells that have taken up and
express the nucleic acid at readily detected levels.

In an exemplary protocol, GFP fluorescence of transfected cell cultures
may be monitored visually during culture using an inverted microscope
equipped with epifluorescence illumination (Axiovert 25; Zeiss, North York,
ON) and a #41017 Endow GFP filter set (Chroma Technologies, Brattleboro,
VT). Enrichment of GFP-expressing populations can be carried out as
follows. Cell sorting may be carried out, for example, using a FACS Vantage
flow cytometer (Becton Dickinson Immunocytometry Systems, San Jose,
CA) equipped with the turbo-sort option and 2 Innova 306 lasers (Coherent,
Palo Alto, CA). For cell sorting, a 70 µm nozzle can be used. The buffer can
be changed to PBS (maintained at 20 p.s.i.). GFP may be excited with a 488
nm laser beam and emission detected in FL1 using a 500 EFLP filter.
Forward and side scattering can be adjusted to select for viable cells. Gating
parameters may be adjusted using untransfected cells as a negative control
and GFP CHO cells as a positive control.
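
The gating step described above, in which untransfected cells define the negative boundary, can be illustrated with a minimal Python sketch. It is not part of the protocol: the function names, the 99th-percentile cutoff and the intensity values are hypothetical, and an actual instrument gates on multi-parameter plots rather than on a single FL1 threshold.

```python
import math


def gate_threshold(negative_fl1, percentile=99.0):
    """Nearest-rank percentile of FL1 intensities from untransfected cells.

    Events brighter than the returned value are treated as GFP-positive.
    The percentile is an assumed cutoff, not a value taken from the protocol.
    """
    ordered = sorted(negative_fl1)
    if not ordered:
        raise ValueError("negative control contains no events")
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    return ordered[rank - 1]


def split_events(sample_fl1, threshold):
    """Split events into (GFP-positive, waste) using the FL1 threshold."""
    positive = [x for x in sample_fl1 if x > threshold]
    waste = [x for x in sample_fl1 if x <= threshold]
    return positive, waste


# Hypothetical FL1 readings in arbitrary units.
negative_control = list(range(1, 101))
threshold = gate_threshold(negative_control)
positive, waste = split_events([50, 99, 100, 500], threshold)
print(threshold, positive, waste)
```

In this sketch the positive fraction corresponds to the cells dispensed into protoplast medium and the remainder to the cells directed to waste.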

For the first round of sorting, transfected cells may be harvested
post-transfection (e.g., about 7-14 days post-transfection), converted to
protoplasts, resuspended in about 10 ml of growth medium and sorted for
GFP-expressing populations using the parameters described above. GFP-positive
cells may be dispensed into a volume of about 5-10 ml of protoplast medium
while non-expressing cells are directed to waste. The expressing cells may
be cultured. Plant cells or calli can then be analyzed, for example, by
fluorescence in situ hybridization screening.

5. Analysis of transformed cells and identification and 
manipulation of artificial chromosomes 

Cells into which nucleic acids have been introduced, and which may
or may not have been further treated as described herein, may be analyzed
for indications of amplification of chromosomal segments, the presence of
structures that may arise in connection with amplification and de novo
artificial chromosome formation and/or the presence of desired artificial
chromosomes as described herein. Analysis of the cells typically involves
methods of visualizing chromosome structure, including, but not limited to,
G- and C-banding, PCR, Southern blotting and FISH analyses, using techniques
described herein and/or known to those of skill in the art. Such analyses can
employ specific labelling of particular nucleic acids, such as satellite DNA
sequences, heterochromatin, rDNA sequences and heterologous nucleic acid
sequences, that may be subject to amplification. During analysis of
transfected cells, a change in chromosome number and/or the appearance of
distinctive chromosomal structures (distinctive, for example, by increased
segmentation arising from amplification of repeat units) will also assist in
identification of cells containing artificial chromosomes. The following
description of events and structures that may be observed in analyzing cells
for evidence of chromosomal amplification and/or the presence of artificial
chromosomes is intended to be illustrative of the observations and
considerations that may occur in the analysis of cells of any type, including
mammalian and plant cells. It should be recognized that numerous types of
structures may be formed during amplification of chromosomal segments and
treatment of the cells. Additional, yet related, structures and variations of
these structures are contemplated herein and are recognizable based on the
descriptions and teachings of the generation and identification of artificial
chromosomes presented herein. Each structure can be further manipulated,
for example using procedures described herein, to derive additional
chromosomal structures and compositions.

Typically, de novo centromere formation occurs in cells upon
integration of heterologous nucleic acids into the cell chromosomes and
amplification of chromosomal and heterologous nucleic acids. The
integration and amplification that give rise to de novo centromere formation
typically occur at the centromeric region of the short arm of a chromosome,
typically an acrocentric chromosome. By employing methods such as
chromosome-staining methods, including FISH and G- and C-banding, it may
be possible to identify a chromosome at which the process occurs.

The amplification can lead to the formation of multicentric, typically
dicentric, chromosomes. Because of the presence of two or more
functionally active centromeres on the same chromosome, regular breakages
occur between the centromeres. Such specific chromosome breakages can
give rise to the appearance of a chromosome fragment carrying a
neo-centromere. The neo-centromere may be found on a minichromosome
(neo-minichromosome), while a formerly dicentric chromosome may carry traces
of the heterologous nucleic acid.

a. The neo-minichromosome 

Breakage of a dicentric chromosome between the two functional
centromeres can form at least two chromosomes, for example, a so-called
minichromosome, and a formerly dicentric chromosome. Treatment of cells
containing a dicentric chromosome, such as, for example, recloning,
treatment with agents that destabilize the chromosomes, e.g., BrdU, and/or
culturing under selective conditions, may facilitate breakage of the dicentric
chromosome. Selection of transformed cells can yield cell lines containing a
stable neo-minichromosome. The breakage of a multicentric, typically
dicentric, chromosome in transformed cells, which separates the
neo-centromere from the remainder of the endogenous chromosome, may occur,
for example, in the G-band positive heterologous nucleic acid region, as is
suggested if traces of the heterologous nucleic acid sequences at the broken
end of the formerly dicentric chromosome are observed.

Multiple E-type amplification (amplification of euchromatin) may form a
neo-chromosome, which separates from the remainder of the dicentric
chromosome through a specific breakage between the centromeres of the
dicentric chromosome. Inverted duplication of the fragment bearing the
neo-centromere can result in the formation of a stable neo-minichromosome.
The minichromosome is generally at least about 20-30 Mb in size.

The presence of inverted chromosome segments can be associated
with the chromosomes formed de novo at the centromeric region of a
chromosome. During the formation of the neo-minichromosome, the event
leading to the stabilization of the distal segment of the chromosome that
bears the duplicated neo-centromere may be the formation of its inverted
duplicate.

Although the neo-minichromosome typically carries only one functional
centromere, both ends of the minichromosome can be heterochromatic,
carrying, for example, satellite DNA sequences as discernible by in situ
hybridization. Comparison of the G-band pattern of a chromosome fragment
carrying the neo-centromere with that of a stable neo-minichromosome can
indicate that the neo-minichromosome is an inverted duplicate of the
chromosome fragment that bears the neo-centromere.

Cells containing a de novo-formed minichromosome, which contains
multiple repeats of the heterologous nucleic acids, can be used as recipient
cells in cell transfection. Donor nucleic acids, such as heterologous nucleic
acids containing DNA encoding a desired protein and DNA encoding a
second selectable marker, can be introduced into the cells and integrated into
the de novo-formed minichromosomes. To facilitate integration into the de
novo-formed minichromosomes, the heterologous DNA may also contain
sequences that are homologous to nucleic acids already present in the
minichromosomes, which can, through homologous recombination, provide
targeted integration into the minichromosome. Nucleic acids can also be
integrated into the minichromosome through the use of site-specific
recombinases by producing minichromosomes containing site-specific
recombination sites as described herein. Integration can be verified by in situ
hybridization and Southern blot analyses. Transcription and translation of
heterologous DNA can be confirmed by primer extension, immunoblot
analyses and reporter gene assays, if a reporter gene has been included in
the heterologous DNA, using, for example, appropriate nucleic acid probes
and/or product-specific antibodies.

The resulting engineered minichromosome that contains the heterologous
DNA can also be transferred, for example by cell fusion, into a recipient
cell line to further verify correct expression of the heterologous DNA.
Following production of the cells, metaphase chromosomes can be obtained,
such as by addition of colchicine, and the minichromosomes purified using
methods as described herein. The resulting minichromosomes can be used
for delivery to specific cells of interest using any known method or methods
for transferring heterologous nucleic acids into cells, particularly plant cells,
and/or methods described herein.

Thus, the neo-minichromosome is stably maintained in cells, replicates
autonomously, and permits the persistent, long-term expression of genes
under non-selective culture conditions, and in a whole, intact, regenerated
plant. It also can contain megabases of heterologous known DNA that can
serve as target sites for homologous recombination and integration of DNA
of interest. The neo-minichromosome is, thus, a vector for the delivery of
nucleic acids to, and expression of nucleic acids in, cells.

Cell lines that contain artificial chromosomes, such as the
minichromosome, the neo-chromosome, and the heterochromatic artificial
chromosomes, are a convenient source of these chromosomes and can be
manipulated, such as by cell fusion or production of microcells for fusion
with selected cell lines, to deliver the chromosome of interest into a
multiplicity of cell lines, including cells from a variety of different plant
species.

b. Heterochromatin-containing and predominantly
heterochromatic artificial chromosomes

Manipulation of cells containing a fragment released upon breakage of
the dicentric chromosome (e.g., a formerly dicentric chromosome), for
example, by introducing additional heterologous nucleic acids, including, for
example, DNA encoding a second selectable marker, and growth under
selective conditions, can yield heterochromatic structures. Included among
such structures are compositions referred to as sausage chromosomes and
megachromosomes. For example, a formerly dicentric chromosome may
translocate to the end of another chromosome, such as an acrocentric
chromosome. Additional heterologous nucleic acids added to cells containing
a formerly dicentric chromosome can integrate into the pericentric
heterochromatin of the formerly dicentric chromosome and be amplified
several times with megabases of pericentric heterochromatic satellite DNA
sequences, forming a "sausage" chromosome carrying a newly formed
heterochromatic chromosome arm. The size of this heterochromatic arm can
vary, for example, between ~150 and ~800 Mb in individual metaphases.
The chromosome arm can contain four to five satellite segments rich in
satellite DNA, and evenly spaced integrated heterologous "foreign" DNA
sequences. At the end of the compact heterochromatic arm of the sausage
chromosome, a less condensed euchromatic terminal segment may be
observed. By capturing a euchromatic terminal segment, this new
chromosome arm is stabilized in the form of the "sausage" chromosome. In
subclones of sausage chromosome-containing cell lines, the heterochromatic
arm of the sausage chromosome may become unstable and show continuous
intrachromosomal growth, particularly after treatment with BrdU and/or drug
selection to induce further H-type amplification. In extreme cases, the
amplified chromosome arm can exceed 500 Mb or even 1000 Mb in size
(gigachromosome). Thus, the gigachromosome is a structure in which a
heterochromatic arm has amplified but not broken off from a euchromatic
arm.

In situ hybridization with, for example, biotin-labeled subfragments of
the added heterologous nucleic acids may show a hybridization signal only in
the heterochromatic arm of the sausage chromosome, indicating that the
heterologous nucleic acid sequences are localized in the pericentric
heterochromatin.



Gene expression, however, may be possible in the heterochromatic
environment of a sausage chromosome. The level of heterologous gene
expression may be determined by Northern hybridization with a subfragment
of the selectable marker gene. Reporter genes included in heterologous
nucleic acids also provide a readily detectable product for use in evaluating
gene expression in a sausage or other heterochromatic or predominantly
heterochromatic chromosome. Southern hybridization of DNA isolated
from subclones of sausage chromosome-containing cells with subfragments
of reporter (and selectable marker) genes can show a close correlation
between the intensity of hybridization and the length of the sausage
chromosome.

Cell lines containing sausage chromosomes can be manipulated to
yield additional heterochromatic structures and artificial chromosomes,
including, for example, an artificial chromosome referred to as a
megachromosome. Such manipulation includes fusion of the cell line with
other cells and growth in the presence of one or more selection agents
and/or BrdU.

Cells with a structure, such as the sausage chromosome, can be
selected and fused with a second cell line, including other plant and
non-plant species [see, e.g., Dudits et al. (1976) Hereditas 82:121-123 for the
fusion of human cells with carrot protoplasts and Wiegand et al. (1987) J.
Cell. Sci. (Pt. 2):145-149 for laser-induced fusion of plant protoplasts with
mammalian cells], to eliminate other chromosomes that are not of interest.
Structures such as sausage chromosomes formed during this process may be
further manipulated, for example, by treating the cells with agents that
destabilize chromosomes, e.g., BrdU, so that the heterochromatic arm forms
a chromosome that is substantially heterochromatic (e.g., a
megachromosome). Structures such as the gigachromosome, in which the
heterochromatic arm has amplified but not broken off from the euchromatic
arm, may also be observed. Further manipulation, such as fusions and
growth in selective conditions and/or BrdU treatment or other such
treatment, can lead to fragmentation of the megachromosome to form
smaller chromosomes that have the amplicon as the basic repeating unit.

If a cell with a sausage chromosome is selected, it can be treated with
an agent, such as BrdU, that destabilizes the chromosome so that the
heterochromatic arm forms a chromosome that is substantially
heterochromatic (e.g., a megachromosome). Prior to treating the cell with
BrdU, it can be fused with another cell line carrying chromosomes of another
species, in order to eliminate chromosomes of the original host cell and
obtain a cell in which the only chromosome from the host cell is the sausage
chromosome. The resulting hybrid cells can be grown in the presence of
multiple selection agents to select for those that carry the sausage
chromosome. In situ hybridization with chromosome painting probes that
detect chromosomes of both the host cell species and the species of cell to
which the host cell was fused can provide an indication of the chromosomal
makeup of the hybrid cells.

Cell lines containing a sausage chromosome can be treated with a
destabilizing agent, such as BrdU, followed by growth in selective medium
and retreatment with BrdU. The BrdU treatments appear to destabilize the
genome, resulting in a change in the sausage chromosome as well. A cell
population in which a further amplification has occurred will arise. In
addition to the heterochromatic arm (which may, for example, be ~100-150
Mb) of the sausage chromosome, an extra centromere and another (for
example, ~150-250 Mb) heterochromatic chromosome arm may be formed.
By the acquisition of another euchromatic terminal segment, a new
submetacentric chromosome (e.g., megachromosome) can form.

Megachromosomes may also be produced through regrowth and
establishment of sausage chromosome-containing cells in selective medium.
Repeated BrdU treatment can produce cell lines that have a dwarf
megachromosome (for example, about 150-200 Mb), a truncated
megachromosome (for example, about 90-120 Mb), or a
micro-megachromosome (for example, about 50-90 Mb). Cell lines containing
smaller truncated megachromosomes can be used to generate even smaller
megachromosomes, e.g., ~10-30 Mb in size. This may be accomplished,
for example, by breakage and fragmentation of a micro-megachromosome
through exposing the cells to X-ray irradiation, BrdU or telomere-directed in
vivo chromosome fragmentation.
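
As a reading aid, the approximate size ranges quoted above (about 150-200 Mb, 90-120 Mb, 50-90 Mb and ~10-30 Mb) can be arranged as a simple lookup. The following Python sketch is only an illustration under stated assumptions: the function name, the label for the ~10-30 Mb class and the handling of the shared 90 Mb boundary are hypothetical, and sizes falling between the quoted ranges are reported as unclassified.

```python
# Approximate size ranges in Mb as quoted in the text. The label of the last
# entry is a placeholder chosen for this sketch.
SIZE_CLASSES = [
    ("dwarf megachromosome", 150.0, 200.0),
    ("truncated megachromosome", 90.0, 120.0),
    ("micro-megachromosome", 50.0, 90.0),
    ("smaller megachromosome", 10.0, 30.0),
]


def classify_by_size(size_mb):
    """Return the first class whose range contains size_mb, else None."""
    for label, low, high in SIZE_CLASSES:
        if low <= size_mb <= high:
            return label
    return None


for size in (175, 90, 60, 130, 20):
    print(size, classify_by_size(size))
```

Because the ranges are checked in order, a 90 Mb chromosome is reported here as a truncated megachromosome; that tie-break is a choice of the sketch, not a statement of the text.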

Apart from the euchromatic terminal segments and the integrated
foreign nucleic acid, the whole megachromosome, as well as other related
types of predominantly heterochromatic artificial chromosomes, is
constitutive heterochromatin. This can be demonstrated by C-banding of the
megachromosome, which results in positive staining characteristic of
constitutive heterochromatin. It can contain tandem arrays of satellite DNA.
In a particular example, satellite DNA blocks are organized into a giant
palindrome (amplicon) carrying integrated exogenous nucleic acid sequences
at each end. It is of course understood that the specific organization and
size of each component can vary among species, and also with the chromosome
in which the amplification event initiates.

In general, a clear segmentation may be observed in one or more arms
of an amplification-based chromosome. For example, a megachromosome
may contain building units that are amplicons of, for example, ~30 Mb
containing satellite DNA with the integrated "foreign" DNA sequences at
both ends. The ~30 Mb amplicons may be composed of two ~15 Mb
inverted doublets of ~7.5 Mb satellite DNA blocks, which are separated
from each other by a narrow band of non-satellite sequences. The wider
non-satellite regions at the amplicon borders may contain integrated,
exogenous (heterologous) nucleic acid, while any narrow bands of
non-satellite DNA sequences within the amplicons may be integral parts of the
pericentric heterochromatin of the host chromosomes. The sizes of the
building units of a megachromosome or other amplification-based
chromosome may vary depending on the species of the host chromosome
from which the artificial chromosome was generated.
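
The nesting of building units in this example (two ~7.5 Mb satellite blocks per ~15 Mb inverted doublet, two doublets per ~30 Mb amplicon) can be checked with a short worked calculation. The Python sketch below is an illustration only: the function names are hypothetical, the sizes are the approximate example values from the text, and the narrow non-satellite bands and integrated heterologous DNA are ignored.

```python
SATELLITE_BLOCK_MB = 7.5    # one satellite DNA block (approximate, from the text)
BLOCKS_PER_DOUBLET = 2      # an inverted doublet pairs two blocks
DOUBLETS_PER_AMPLICON = 2   # an amplicon is composed of two inverted doublets


def doublet_mb(block_mb=SATELLITE_BLOCK_MB):
    """Approximate size of one inverted doublet in Mb."""
    return BLOCKS_PER_DOUBLET * block_mb


def amplicon_mb(block_mb=SATELLITE_BLOCK_MB):
    """Approximate size of one amplicon in Mb."""
    return DOUBLETS_PER_AMPLICON * doublet_mb(block_mb)


def approx_amplicon_count(region_mb, block_mb=SATELLITE_BLOCK_MB):
    """Whole amplicons that fit in a region of region_mb (rough estimate)."""
    return int(region_mb // amplicon_mb(block_mb))


print(doublet_mb(), amplicon_mb(), approx_amplicon_count(150.0))
```

With the default values the doublet and amplicon sizes come out at 15 Mb and 30 Mb, and a hypothetical 150 Mb region would hold about five amplicons. Passing a different `block_mb` reflects the point that building-unit sizes may vary with the host species.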

Further BrdU treatment can produce cells and/or calli that include cells
with a truncated megachromosome. The megachromosome can be further
fragmented in vivo using a chromosome fragmentation vector to ultimately
produce a chromosome that comprises a smaller stable replicable unit, for
example, about 15 Mb-60 Mb, containing one to four megareplicons.

Apart from the euchromatic terminal segments, the whole
megachromosome is heterochromatic, and has structural homogeneity.
Therefore, artificial chromosomes such as the megachromosome offer a
unique possibility for obtaining information about the amplification process,
and for analyzing some basic characteristics of the pericentric constitutive
heterochromatin, as a vector for heterologous DNA, and as a target for
further fragmentation.

C. Isolation of Artificial Chromosomes

The artificial chromosomes provided herein can be isolated by any
suitable method known to those of skill in the art. Also, methods are
provided herein for effecting substantial purification, particularly of the
artificial chromosomes.

Artificial chromosomes may be sorted from endogenous
chromosomes using any suitable procedures, which typically involve isolating
metaphase chromosomes, distinguishing the artificial chromosomes from the
endogenous chromosomes, and separating the artificial chromosomes from
endogenous chromosomes. Such procedures will generally include the
following basic steps for animal cells and protoplasts: (1) culture of a
sufficient number of cells (typically about 2 x 10^7 mitotic cells) to yield,
preferably, on the order of 1 x 10^6 artificial chromosomes, (2) arrest of the
cell cycle of the cells in a stage of mitosis, preferably metaphase, using a
mitotic arrest agent such as colchicine, (3) treatment of the cells, particularly
by cell wall dissolution for plant cells and/or swelling of the cells in hypotonic
buffer, to increase susceptibility of the cells to disruption, (4) application
of physical force to disrupt the cells in the presence of isolation buffers for
stabilization of the released chromosomes, (5) dispersal of chromosomes in
the presence of isolation buffers for stabilization of free chromosomes, (6)
separation of artificial chromosomes from endogenous chromosomes and
(7) storage (and shipping if desired) of the isolated artificial chromosomes in
appropriate buffers. Modifications and variations of the general procedure
for isolation of artificial chromosomes, for example to accommodate different
cell types with differing growth characteristics and requirements and to
optimize the duration of mitotic block with arresting agents to obtain the
desired balance of chromosome yield and level of debris, may be empirically
determined (see Examples).

Steps 1-5 relate to isolation of metaphase chromosomes. The
separation of artificial from endogenous chromosomes (step 6) may be
accomplished in a variety of ways. For example, the chromosomes may be
stained with DNA-specific dyes such as Hoechst 33258 and chromomycin
A3 and sorted into artificial chromosomes and endogenous chromosomes on
the basis of dye content by employing fluorescence-activated cell sorting
(FACS).

Artificial chromosomes have been isolated by fluorescence-activated
cell sorting (FACS). This method takes advantage of the nucleotide base
content of the artificial chromosomes. In the case of predominantly
heterochromatic artificial chromosomes, by virtue of their high
heterochromatic DNA content, they will differ from any other chromosomes
in a cell. In a particular embodiment, metaphase chromosomes are isolated
and stained with base-specific dyes, such as Hoechst 33258 and
chromomycin A3. Fluorescence-activated cell sorting will separate artificial
chromosomes from the endogenous chromosomes. A dual-laser cell sorter
(such as, for example, a FACS Vantage, Becton Dickinson Immunocytometry
Systems), in which two lasers were set to excite the dyes separately, allowed
a bivariate analysis of the chromosomes by base-pair composition and size.
Cells containing such artificial chromosomes can be similarly sorted.
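
The bivariate analysis described above can be pictured as placing each chromosome event on a two-dimensional plot of the two dye signals and collecting the events that fall inside a chosen region. The Python sketch below is a simplified illustration, not the sorter's actual logic: the class names, the rectangular gate and all signal values are hypothetical. It relies only on the general property that Hoechst 33258 binds preferentially to AT-rich DNA and chromomycin A3 to GC-rich DNA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    hoechst: float       # Hoechst 33258 signal (AT-rich DNA preference)
    chromomycin: float   # chromomycin A3 signal (GC-rich DNA preference)


@dataclass(frozen=True)
class Gate:
    hoechst_min: float
    hoechst_max: float
    chromomycin_min: float
    chromomycin_max: float

    def contains(self, event):
        """True if the event lies inside this rectangular gate."""
        return (self.hoechst_min <= event.hoechst <= self.hoechst_max
                and self.chromomycin_min <= event.chromomycin <= self.chromomycin_max)


def sort_chromosomes(events, gate):
    """Return (events inside the gate, events outside the gate)."""
    inside = [e for e in events if gate.contains(e)]
    outside = [e for e in events if not gate.contains(e)]
    return inside, outside


# Hypothetical gate and events in arbitrary fluorescence units.
ac_gate = Gate(hoechst_min=800, hoechst_max=1000,
               chromomycin_min=100, chromomycin_max=300)
events = [Event(900, 200), Event(400, 450), Event(850, 600)]
sorted_acs, rest = sort_chromosomes(events, ac_gate)
print(len(sorted_acs), len(rest))
```

In practice the position and shape of the gate would be set empirically from the flow karyotype of the cell line in question.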

Preparative amounts of artificial chromosomes (for example, 5 x 10^4 -
5 x 10^7 chromosomes/ml) at a purity of 95% or higher can be obtained. The
resulting artificial chromosomes are used for delivery to cells by methods
such as, for example, microinjection, liposome-mediated transfer, and
electroporation.

Additional methods provided herein for isolation of artificial
chromosomes from endogenous chromosomes include procedures that are
particularly well suited for large-scale isolation of artificial chromosomes. In
these methods, the size and density differences between artificial
chromosomes and endogenous chromosomes are exploited to effect
separation of these two types of chromosomes. To facilitate larger scale
isolation of the artificial chromosomes, different separation techniques may
be employed, such as swinging bucket centrifugation (to effect separation
based on chromosome size and density) [see, e.g., Mendelsohn et al. (1968)
J. Mol. Biol. 32:101-108], zonal rotor centrifugation (to effect separation on
the basis of chromosome size and density) [see, e.g., Burki et al. (1973)
Prep. Biochem. 3:157-182; Stubblefield et al. (1978) Biochem. Biophys. Res.
Commun. 83:1404-1414], and velocity sedimentation (to effect separation on
the basis of chromosome size and shape) [see, e.g., Collard et al. (1984)
Cytometry 5:9-19].




Affinity-, particularly immunoaffinity-, based methods for separation of
ACs from endogenous chromosomes are also provided herein. For example,
artificial chromosomes which are predominantly heterochromatin may be
separated from endogenous chromosomes through immunoaffinity
procedures involving antibodies that specifically recognize heterochromatin,
and/or the proteins associated therewith, when the endogenous
chromosomes contain relatively little heterochromatin.

Immunoaffinity purification may also be employed in larger scale
artificial chromosome isolation procedures. In this process, large
populations of artificial chromosome-containing cells (asynchronous or
mitotically enriched) are harvested en masse and the mitotic chromosomes
(which can be released from the cells using standard procedures such as by
incubation of the cells, such as freshly isolated protoplasts, in hypotonic
buffer and/or detergent treatment of the cells in conjunction with physical
disruption of the treated cells) are enriched by binding to antibodies that are
bound to solid state matrices (e.g., column resins or magnetic beads).
Antibodies suitable for use in this procedure bind to condensed centromeric
proteins or condensed and DNA-bound histone proteins. For example,
autoantibody LU851 (see Hadlaczky et al. (1989) Chromosoma 97:282-288),
which recognizes mammalian centromeres, may be used for large-scale
isolation of chromosomes prior to subsequent separation of artificial
chromosomes from endogenous chromosomes using methods such as FACS.
The bound chromosomes would be washed and eventually eluted for sorting.

Immunoaffinity purification may also be used directly to separate
artificial chromosomes from endogenous chromosomes. For example, in the
case of artificial chromosomes that are predominantly heterochromatic, the
artificial chromosomes may be generated in or transferred to (e.g., by
microinjection or microcell fusion as described herein) a cell line that has
chromosomes that contain relatively small amounts of heterochromatin, such
as hamster cells (e.g., V79 cells or CHO-K1 cells). The predominantly
heterochromatic artificial chromosomes are then separated from the
endogenous chromosomes by utilizing anti-heterochromatin binding protein
(Drosophila HP-1) antibody conjugated to a solid matrix. Such a matrix
preferentially binds artificial chromosomes relative to hamster chromosomes.
Unbound hamster chromosomes are washed away from the matrix and the
artificial chromosomes are eluted by standard techniques. Similarly, artificial
chromosomes of one species, e.g., a plant-derived artificial chromosome,
may be separated from a background of endogenous chromosomes of
another species, e.g., animal, such as mammalian, chromosomes, based on
immunological differences of the two species, provided that antibodies that
specifically recognize one species and not the other are available or can be
generated.

D. Generation of Artificial Chromosomes Through Assembly of 
Component Elements 

Artificial chromosomes can be constructed in vitro by assembling the
structural and functional elements that contribute to a complete chromosome
capable of stable replication and segregation alongside endogenous
chromosomes in cells. The identification of the discrete elements that in
combination yield a functional chromosome has made possible the in vitro
assembly of artificial chromosomes. The process of in vitro assembly of
artificial chromosomes, which can be rigidly controlled, provides advantages
that may be desired in the generation of chromosomes that, for example, are
required in large amounts or that are intended for specific use in transgenic
organism systems.

For example, in vitro assembly may be advantageous when efficiency
of time and scale are important considerations in the preparation of artificial
chromosomes. Because in vitro assembly methods do not involve extensive
cell culture procedures, they may be utilized when the time and labor
required to transform, feed, cultivate, and harvest cells used in de novo
cell-based production systems are unavailable.

Provided herein are in vitro assembly methods that include the joining
of essential components, such as a centromere, telomere and an origin of
replication, to yield an artificial chromosome, in particular, an artificial
chromosome that functions in plants and that may contain components
derived from plant chromosomes. Also provided are artificial chromosomes
produced by the methods. Particular embodiments of the methods and
chromosomes include a megareplicator. The megareplicator may contain
rDNA, for example, mammalian or plant rDNA. In vitro assembled artificial
chromosomes may contain any amount of heterochromatic and/or
euchromatic nucleic acid. For example, an in vitro assembled artificial
chromosome may be substantially all heterochromatin, while still containing
protein-encoding DNA, or may contain increasing amounts of euchromatic
DNA, such that, for example, it contains about 10%, 20%, 30%, 40%,
50%, 60%, 70%, 80%, 90% or greater than about 90% euchromatic DNA.

In vitro assembly may also be rigorously controlled with respect to the
exact manner in which the several elements of the desired artificial
chromosome are combined and in what sequence and proportions they are
assembled to yield a chromosome of precise specifications. This feature is
of particular significance in the generation of plant artificial chromosomes
containing one or more regions of segmentation as described herein with
reference to amplification-based artificial chromosomes. For example, certain
plant chromosome structures (such as acrocentric chromosomes and/or
chromosomes containing adjacent regions of heterochromatin and rDNA) that
may be desirable for use in the generation of particular types of plant
artificial chromosomes via amplification-based methods as described herein
may be limited in number or may not exist. These particular types of plant
artificial chromosomes, e.g., certain predominantly heterochromatic plant
artificial chromosomes, may also be generated via in vitro assembly of
artificial chromosomes as described herein.

For example, plant artificial chromosomes containing regions of
repeated nucleic acid units that are predominantly heterochromatic may be
assembled by joining essential chromosomal components and repeat regions,
or may be generated from an in vitro assembled artificial chromosome via
amplification of heterochromatic DNA contained within an in vitro assembled
artificial chromosome. For generation of such chromosomes via amplification
of heterochromatic DNA contained within an in vitro assembled artificial
chromosome, nucleic acids are introduced into a cell containing an in vitro
assembled artificial chromosome and a resulting cell is selected that contains
an artificial chromosome containing one or more regions of repeated nucleic
acid units that are predominantly heterochromatic. The in vitro assembled
artificial chromosome either contains a megareplicator to facilitate
amplification of chromosomal DNA in connection with integration of nucleic
acid into the chromosome, or megareplicator-containing DNA is included in
the nucleic acid that is integrated into the in vitro assembled artificial
chromosome.

The following describes the processes involved in the assembly of
artificial chromosomes in vitro, utilizing a megachromosome as exemplary
starting material.

1. Identification and isolation of the components of the artificial
chromosome

The chromosomes provided herein are elegantly simple chromosomes
for use in the identification and isolation of components to be used in the in
vitro assembly of expression systems or artificial chromosomes. The ability
to purify artificial chromosomes to a very high level of purity, as described
herein, facilitates their use for these purposes. For example, the
megachromosome, particularly truncated forms thereof, serves as starting
material. With respect to the construction of an artificial chromosome
containing at least some mammalian cell derived components, possible
starting materials can be obtained from, for example, cell lines such as 1B3
and mM2C1, which are derived from H1D3 (deposited at the European
Collection of Animal Cell Culture (ECACC) under Accession No. 96040929).
With respect to the construction of an artificial chromosome containing at
least some plant cell derived components, possible starting materials include
cells containing PACs, e.g., megachromosomes, generated as described
herein.

For example, the mM2C1 cell line contains a micro-megachromosome
(~50-60 kB), which advantageously contains only one centromere, two
regions of integrated heterologous DNA with adjacent rDNA sequences, with
the remainder of the chromosomal DNA being mouse major satellite DNA.
Other truncated megachromosomes can serve as a source of telomeres, or
telomeres can be provided. The centromere of the mM2C1 cell line contains
mouse minor satellite DNA, which provides a useful tag for isolation of the
centromeric DNA.

Additional features of particular ACs provided herein, such as the
micro-megachromosome of the mM2C1 cell line, that make them uniquely
suited to serve as starting materials in the isolation and identification of
chromosomal components include the fact that the centromeres of each
megachromosome within a single specific cell line are identical. The ability
to begin with a homogeneous centromere source (as opposed to a mixture of
different chromosomes having differing centromeric sequences) greatly
facilitates the cloning of the centromere DNA. By digesting purified
megachromosomes, particularly truncated megachromosomes, such as the
micro-megachromosome, with appropriate restriction endonucleases and
cloning the fragments into commercially available and well known YAC
vectors (see, e.g., Burke et al. (1987) Science 236:806-812), BAC vectors
(see, e.g., Shizuya et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:8794-8797;
bacterial artificial chromosomes, which have a capacity of incorporating
0.9-1 Mb of DNA) or PAC vectors (the P1 artificial chromosome vector,
which is a P1 plasmid derivative that has a capacity of incorporating 300 kb
of DNA and that is delivered to E. coli host cells by electroporation rather
than by bacteriophage packaging; see, e.g., Ioannou et al. (1994) Nature
Genetics 6:84-89; Pierce et al. (1992) Meth. Enzymol. 216:549-574; Pierce
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:2056-2060; U.S. Patent No.
5,300,431 and International PCT application No. WO 92/14819) vectors, it
plant satellite DNA, the heterologous DNA and/or rDNA, may be used to
identify and eliminate the non-centromeric DNA-containing clones.

Additionally, centromere cloning methods described herein may be
utilized to isolate the centromere-containing sequence of the
megachromosome.

Once the centromere fragment has been isolated, it may be sequenced
and the sequence information may in turn be used in PCR amplification of
centromere sequences from megachromosomes or other sources of
centromeres. Isolated centromeres may also be tested for function in vivo by
transferring the DNA into a host cell. Functional analysis may include, for
example, examining the ability of the centromere sequence to bind
centromere-binding proteins. The cloned centromere will be transferred to
cells with a selectable marker gene and the binding of a centromere-specific
protein, such as anti-centromere antibodies (e.g., LU851, see, Hadlaczky et
al. (1986) Exp. Cell Res. 167:1-15) can be used to assess function of the
centromeres.

b. Telomeres 

Telomeres that may be used in assembly of an artificial chromosome
include a 1 kB synthetic telomere (see, e.g., PCT Application Publication No.
WO 97/40183). A double synthetic telomere construct, which contains a 1
kB synthetic telomere linked to a dominant selectable marker gene that
continues in an inverted orientation may be used for ease of manipulation.
Such a double construct contains a series of TTAGGG repeats 3' of the
marker gene and a series of repeats of the inverted sequence, i.e., GGGATT,
5' of the marker gene as follows:

(GGGATT)n - dominant marker gene - (TTAGGG)n.

Using an inverted marker provides an easy means for insertion, such as by
blunt end ligation, since only properly oriented fragments will be selected.
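
By way of illustration only, the layout of such a double construct can be
sketched in Python. The sketch simply reverses the TTAGGG unit to obtain the
inverted unit and places a marker placeholder between the two repeat series.
The function names, the repeat count and the "[marker]" placeholder are
assumptions made for this example and are not part of the constructs
described herein.

```python
# Hypothetical sketch: lay out a double synthetic telomere construct as text.
TELOMERE_UNIT = "TTAGGG"  # repeat unit named in the text


def inverted_unit(unit: str) -> str:
    """Return the unit read in the inverted (reversed) orientation."""
    return unit[::-1]


def double_telomere_construct(n_repeats: int, marker: str) -> str:
    """Join (inverted unit)n, the marker placeholder and (unit)n, 5' to 3'."""
    left = inverted_unit(TELOMERE_UNIT) * n_repeats
    right = TELOMERE_UNIT * n_repeats
    return left + marker + right


if __name__ == "__main__":
    # Three repeats and the "[marker]" placeholder are illustrative only.
    print(double_telomere_construct(3, "[marker]"))
```

The marker is kept as an opaque placeholder because no marker sequence is
specified in this passage.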

Telomere sequences also include sequences described in plants, for
example, an Arabidopsis sequence containing head-to-tail arrays of the
monomer repeat CCCTAAA totaling a few, for example 3-4, kb in length.
Telomere sequences vary in length and do not appear to have a strict length
requirement. An example of a cloned telomere is found in GenBank
accession no. M20158 (Richards and Ausubel (1988) Cell 53:127-136) and
in U.S. Patent No. 5,270,201. Yeast telomere sequences include those
provided in GenBank accession no. S70807 (Louis et al. (1994) Yeast
10:271-274). Additionally, a method for isolating a higher eukaryotic
telomere from A. thaliana has been reported (Richards and Ausubel (1988)
Cell 53:127-136; and U.S. Patent No. 5,270,201).
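
As a rough worked example of the array sizes mentioned above, the following
Python sketch counts how many whole copies of the 7-base monomer fit in a
3 kb or 4 kb head-to-tail array. It assumes that 1 kb equals 1000 bases and
that the array consists of the monomer alone; the function names are
hypothetical.

```python
# Hypothetical sketch: size of a head-to-tail array of the plant monomer.
PLANT_MONOMER = "CCCTAAA"  # monomer repeat named in the text


def monomer_copies(array_length_bp: int, monomer: str = PLANT_MONOMER) -> int:
    """Whole monomer copies that fit in an array of the given length."""
    if array_length_bp < 0:
        raise ValueError("array length must be non-negative")
    return array_length_bp // len(monomer)


def build_array(copies: int, monomer: str = PLANT_MONOMER) -> str:
    """Head-to-tail (tandem) array of the monomer."""
    return monomer * copies


if __name__ == "__main__":
    for kb in (3, 4):
        # Assumes 1 kb = 1000 bases.
        print(kb, "kb:", monomer_copies(kb * 1000), "copies")
```

Under these assumptions a 3-4 kb array corresponds to roughly 430 to 570
copies of the monomer.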

c. Megareplicator

The megareplicator sequences, such as those containing rDNA,
provided herein are preferred for use in artificial chromosomes generated by
assembly of component elements in vitro. The rDNA provides an origin of
replication and also provides sequences that facilitate amplification of the
artificial chromosome in vivo to increase the size of the chromosome to, for
example, accommodate increasing copies of a heterologous gene of interest
as well as continuous high levels of expression of the heterologous genes.

d. Filler heterochromatin
Filler heterochromatin, particularly satellite DNA, is included to
maintain structural integrity and stability of the artificial chromosome and
provide a structural base for carrying genes within the chromosome. The
satellite DNA is typically A/T-rich DNA sequence, such as mouse major
satellite DNA, or G/C-rich DNA sequence, such as hamster natural satellite
DNA. Sources of such DNA include any eukaryotic organisms that carry
non-coding satellite DNA with sufficient A/T or G/C composition to promote
ready separation by sequence, such as by FACS, or by density gradients.
Examples of plant satellite DNA include, but are not limited to, satellite DNA
of soybean (see, e.g., Morgante et al. (1997) Chromosome Res. 5:363-373;
and Vahedian et al. (1995) Plant Mol. Biol. 29:857-862), satellite DNA on
the rye B chromosome (see, e.g., Langdon et al. (2000) Genetics 154:869-
884) and satellite DNA in the Saccharum complex (see, e.g., Alix et al.
(1998) Genome 41:854-864). The satellite DNA may also be synthesized by
generating sequence containing monotone, tandem repeats of highly A/T- or
G/C-rich DNA units.
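
The notion of an A/T-rich or G/C-rich synthetic repeat can be made concrete
with a short Python sketch that builds a tandem repeat from a unit and
reports its A/T fraction. The unit sequences used here are invented for
illustration and are not satellite sequences taken from any organism; the
function names are likewise hypothetical.

```python
# Hypothetical sketch: base composition of a synthetic tandem repeat.
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AT" for base in seq) / len(seq)


def tandem_repeat(unit: str, copies: int) -> str:
    """Monotone tandem repeat: the same unit joined head-to-tail."""
    return unit * copies


if __name__ == "__main__":
    at_rich = tandem_repeat("AATAT", 4)  # invented A/T-rich unit
    gc_rich = tandem_repeat("GGCGC", 4)  # invented G/C-rich unit
    print(at_fraction(at_rich), at_fraction(gc_rich))
```

Because a tandem repeat has the same composition as its unit, the A/T
fraction of the array is fixed by the choice of unit.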
The most suitable amount of filler heterochromatin for use in
construction of the artificial chromosome may be empirically determined by,
for example, including segments of various lengths, increasing in size, in the
construction process. Fragments that are too small to be suitable for use will
not provide for a functional chromosome, which may be evaluated in cell-
based expression studies, or will result in a chromosome of limited functional
lifetime or mitotic and structural stability.

e. Selectable marker

Any convenient selectable marker, including specific examples
described herein, may be used and at any convenient locus in the expression
system.

2. Combination of the isolated chromosomal elements

Once the isolated elements are obtained, they may be combined to
generate the complete, functional artificial chromosome expression system.
This assembly can be accomplished, for example, by in vitro ligation either in
solution, in LMP agarose or on microbeads. The ligation is conducted so that
one end of the centromere is directly joined to a telomere. The other end of
the centromere, which serves as the gene-carrying chromosome arm, is built
up from a combination of satellite DNA and megareplicator sequences, e.g.,
rDNA sequence, and may also contain a selectable marker gene. Another
telomere is joined to the end of the gene-carrying chromosome arm. The
gene-carrying arm is the site at which any heterologous genes of interest, for
example, in expression of desired proteins encoded thereby, are incorporated
either during in vitro assembly of the chromosome or sometime thereafter.
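
The order of elements described above can be summarized in a small Python
sketch that represents the assembled chromosome as an ordered list:
telomere, centromere, gene-carrying arm, telomere. The class and function
names, the set of permitted arm element kinds, and the rule that the arm
must contain both satellite DNA and a megareplicator are assumptions made
for this illustration only.

```python
# Hypothetical sketch: element order of an in vitro assembled chromosome.
from dataclasses import dataclass
from typing import List

# Kinds allowed on the gene-carrying arm in this illustration (assumed).
ARM_KINDS = {"satellite", "megareplicator", "marker", "heterologous"}


@dataclass(frozen=True)
class Element:
    kind: str
    label: str = ""


def assemble(arm: List[Element]) -> List[Element]:
    """Return telomere, centromere, the arm elements, then a second telomere."""
    for element in arm:
        if element.kind not in ARM_KINDS:
            raise ValueError(f"unexpected arm element: {element.kind}")
    kinds = {element.kind for element in arm}
    if not {"satellite", "megareplicator"} <= kinds:
        raise ValueError("arm needs satellite DNA and a megareplicator")
    return [Element("telomere"), Element("centromere"), *arm, Element("telomere")]


if __name__ == "__main__":
    arm = [Element("satellite"), Element("megareplicator", "rDNA"), Element("marker")]
    print(" - ".join(e.kind for e in assemble(arm)))
```

Heterologous genes are modeled as optional arm elements because they may be
incorporated during assembly or afterwards.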

3. Analysis and testing of the artificial chromosome expression
systems

Artificial chromosomes assembled in vitro may be tested for
functionality in cell systems, such as plant and animal cells, using any of the
methods described herein for the artificial chromosomes, minichromosomes,
or known to those of skill in the art.

4. Introduction of desired heterologous DNA into the in vitro
assembled chromosome

Heterologous DNA may be introduced into the in vitro synthesized
chromosome using routine methods of molecular biology, may be introduced
using the methods described herein for the artificial chromosomes, or may be
incorporated into the in vitro assembled chromosome as part of one of the
synthetic elements, such as the heterochromatin. The heterologous DNA
may be linked to a selected repeated fragment, and then the resulting
construct may be amplified in vitro using the methods for such in vitro
amplification provided herein.

In a particular embodiment of these in vitro assembly methods, a site-
specific recombination site is included in the assembly DNA or is added into
the assembled chromosome, such as a plant in vitro assembled artificial
chromosome, after initial assembly. The presence of a recombination site in
the in vitro assembled artificial chromosome facilitates recombinase-catalyzed
introduction of heterologous nucleic acid into the chromosome if the
heterologous nucleic acid also contains a complementary recombination site.
Such recombination systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 91:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 8:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural and artificial chromosomes is described in copending U.S. provisional
application Serial No. 60/294,758, by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S.
provisional application Serial No. 60/366,891, by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on March 21, 2002, U.S. patent
application Serial No. , by Perkins et al., entitled "CHROMOSOME-
BASED PLATFORMS" filed on May 30, 2002, under attorney docket no.
24601-420, and PCT International Application No. , by Perkins et al.,
entitled "CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002,
under attorney docket no. 24601-420PC, each of which is incorporated
herein in its entirety by reference thereto. Thus, also contemplated herein
are in vitro assembled artificial chromosomes, in particular such
chromosomes containing plant chromosome-derived components, that
contain one or more recombination sites, such as an att site.

E. Methods for the Production of Plant Acrocentric Chromosomes and
Plant Chromosomes Containing Adjacent Regions of rDNA and
Heterochromatin

Acrocentric human and mouse chromosomes in which the short arm
contains only pericentric heterochromatin, an rDNA array, and telomeres can
be used in the de novo formation of a satellite DNA based artificial
chromosome (SATAC, also referred to as ACes). In some embodiments of
the methods of producing a plant artificial chromosome provided herein, it
may be desirable to introduce heterologous nucleic acids into a plant
chromosome with arms of unequal length (e.g., into the short arm of an
acrocentric chromosome) and/or containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin or satellite DNA. Of
particular interest in such methods are plant acrocentric chromosomes that
contain rDNA located adjacent to the pericentric heterochromatin or satellite
DNA, and, in particular, on the short arm of the chromosome with little to no
euchromatic DNA between the rDNA and the pericentric heterochromatin.
Utilizing such structures as the initial composition in the generation of plant
artificial chromosomes may facilitate generation of plant artificial
chromosomes that are predominantly heterochromatic. For example,
introduction of heterologous nucleic acid into a cell containing such an
acrocentric plant chromosome such that the nucleic acid integrates into the
pericentric heterochromatin and/or rDNA of the short arm of the chromosome
may be associated with amplification (possibly through "megareplicator"
DNA sequences such as may reside in plant rDNA arrays, also known as the
nucleolar organizing regions (NOR)) of heterochromatin that leads to the
formation of a predominantly heterochromatic plant artificial chromosome.

Naturally occurring acrocentric plant chromosomes are limited in
number, and plant chromosomes with a structure that includes adjacent
regions of heterochromatin and rDNA may not exist or may not exist for a
variety of plant species. Provided herein are methods for generating
acrocentric plant chromosomes and plant chromosomes containing adjacent
regions of rDNA and heterochromatin, in particular, pericentric and/or
satellite heterochromatin. Further provided herein are methods for generating
acrocentric plant chromosomes containing adjacent regions of
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
and rDNA on the short arm of the chromosome.

Also provided herein are plant acrocentric chromosomes in which the
nucleic acid of one or both arms of the chromosome contains less than about
50%, or less than about 40%, or less than about 30%, or less than about
20%, or less than about 10%, or less than about 5%, or less than about
2%, or less than about 1%, or less than about 0.5% or less than about
0.1% euchromatin. In some embodiments of these chromosomes, the
nucleic acid of only one arm, either the short arm or the long arm, contains
less than these specified amounts of euchromatin. In a particular
embodiment of these chromosomes, the nucleic acid of the short arm
contains less than these specified amounts of euchromatin.
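
To make the percentage limits concrete, the following Python sketch
computes the euchromatin content of one chromosome arm from a list of
segments and compares it with a chosen limit. It assumes, for illustration
only, that the amount of nucleic acid is measured as length in base pairs
and that each segment has already been classified; the segment lengths and
function names are invented.

```python
# Hypothetical sketch: euchromatin content of one chromosome arm.
from typing import List, Tuple

Segment = Tuple[str, int]  # (kind, length in base pairs), assumed units


def euchromatin_percent(arm: List[Segment]) -> float:
    """Percentage of the arm's total length classified as euchromatin."""
    total = sum(length for _, length in arm)
    if total <= 0:
        raise ValueError("arm must have positive total length")
    eu = sum(length for kind, length in arm if kind == "euchromatin")
    return 100.0 * eu / total


def below_limit(arm: List[Segment], limit_percent: float) -> bool:
    """True if the arm contains less than limit_percent euchromatin."""
    return euchromatin_percent(arm) < limit_percent


if __name__ == "__main__":
    short_arm = [("heterochromatin", 900), ("rDNA", 50), ("euchromatin", 50)]
    print(euchromatin_percent(short_arm), below_limit(short_arm, 10.0))
```

The word "about" in the stated limits is not modeled here; a strict
comparison is used for simplicity.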

Further provided herein are plant chromosomes containing adjacent
regions of heterochromatin, in particular pericentric heterochromatin or
satellite DNA, and rDNA with little to no euchromatin between the two
regions. With reference to such plant chromosomes, "little to no" means that
the amount of euchromatic DNA, if any, located between the rDNA and
heterochromatin (such as pericentric heterochromatin and/or satellite DNA)
generally does not stain diffusely and recognizably as euchromatin and/or
does not contain protein-encoding genes. Thus, in these chromosomes,
between the heterochromatin (such as pericentric heterochromatin and/or
satellite DNA) and the rDNA, there is substantially no chromatin that is less
condensed than the heterochromatin (e.g., pericentric heterochromatin). The
plant chromosomes containing adjacent regions of rDNA and
heterochromatin (such as pericentric heterochromatin) provided herein may
be acrocentric chromosomes. In a particular embodiment of these plant
chromosomes, the adjacent regions of rDNA and heterochromatin, in
particular pericentric heterochromatin, are contained on the short arm of the
chromosome.

Further provided are methods of utilizing such plant chromosomes in
the generation of plant artificial chromosomes, and, in particular,
predominantly heterochromatic plant artificial chromosomes, such as ACes
(also referred to as SATACs). In particular methods of producing plant
artificial chromosomes provided herein, nucleic acids are introduced into a
cell containing a plant chromosome that is acrocentric and/or contains
adjacent regions of rDNA and heterochromatin, such as pericentric
heterochromatin, the cells are cultured through at least one cell division and
a cell comprising an artificial chromosome, such as a predominantly
heterochromatic artificial chromosome, is selected. In these methods, the
plant chromosome into which nucleic acid is introduced may be an
acrocentric chromosome containing adjacent regions of rDNA and
heterochromatin on the short or long arm, and, in particular, on the short
arm.

The plant chromosomes provided herein can be generated using site-
specific recombination between plant chromosome regions. The regions may
be on the same chromosome or separate chromosomes. Through site-
specific recombination, sections of plant chromosomes may be altered to
remove, invert and/or insert sequences such that a desired plant
chromosome results. The resulting plant chromosome is acrocentric and/or
contains adjacent regions of heterochromatic DNA and rDNA, which may or
may not be on the short arm of an acrocentric chromosome. Thus, the
starting chromosome in these methods may be a plant chromosome or may
be a plant acrocentric chromosome that does not contain adjacent regions of
rDNA and heterochromatin, such as pericentric heterochromatin or satellite
DNA. If the starting chromosome is acrocentric, then it may be used in the
generation of a plant acrocentric chromosome that contains adjacent regions
of heterochromatic DNA (e.g., pericentric heterochromatin and/or satellite
DNA) and rDNA, particularly on the short arm of the chromosome, or to
generate a plant acrocentric chromosome in which the nucleic acid of one or
both arms contains less than about 50%, or less than about 40%, or less
than about 30%, or less than about 20%, or less than about 10%, or less
than about 5%, or less than about 2%, or less than about 1%, or less than
about 0.5% or less than about 0.1% euchromatin.

In one of the methods provided herein for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of rDNA
and heterochromatin, nucleic acid containing a site-specific recombination
site and nucleic acid containing a complementary site-specific recombination
site are introduced into a cell containing one or more plant chromosomes.
The nucleic acids may be introduced into the cell sequentially or
simultaneously. The nucleic acids may also be targeted to particular
chromosomes and/or particular sequences of a chromosome. Such targeting
may be accomplished by including in the nucleic acids sequences
homologous to particular sequences in the chromosome(s).

The cell is then exposed to a recombinase activity. The recombinase
activity can be provided by introduction of nucleic acid encoding the activity
into the cell for expression of the activity therein, or may be added to the cell
from an exogenous source. The recombinase activity is one that catalyzes
recombination between sequences at the two recombination sites. An
appropriate recombination event produces a plant chromosome that is
acrocentric and/or contains adjacent regions of rDNA and heterochromatin
(such as pericentric heterochromatin and/or satellite DNA) which may be
readily identified therein based on its particular structure (e.g., arms of
unequal length if the chromosome is acrocentric) and/or other features, e.g.,
the presence of particular added sequences, such as recombination sites and
DNA encoding a selectable marker, the absence of particular sequences,
such as excised euchromatic DNA, and the arrangement of sequences, such
as the placement of rDNA segments adjacent to pericentric heterochromatin
and/or satellite DNA. Such attributes may be detected using techniques
known in the art for the analysis of nucleic acids and chromosomes, such as,
for example, in situ hybridization.

A number of site-specific recombination systems may be used in the
production of plant chromosomes that are acrocentric and/or contain rDNA
adjacent to heterochromatin, such as pericentric heterochromatin, as
described herein. Such systems include, but are not limited to, Cre/lox [see,
e.g., Dale and Ow (1995) Gene 91:79-85], FLP/FRT [see, e.g., Nigel et al.
(1995) The Plant Journal 8:637-652], R/RS [see, e.g., Onouchi et al. (1991)
Nuc. Acids Res. 19:6373-6378], Gin/gix [see, e.g., Maeser and Kahman
(1991) Mol. Gen. Genet. 230:170-176] and Int/att. The introduction of att
recombination sites into a chromosome and the use of lambda phage
integrase recombinase in conjunction therewith to permit engineering of
natural chromosomes is described in copending U.S. provisional application
Serial No. 60/294,758 by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on May 30, 2001, U.S. provisional application Serial No.
60/366,891, by Perkins et al., entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al., entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al., entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC, each of which is incorporated herein in
its entirety by reference thereto. These systems, as well as others known in
the art, can be used to specifically excise or invert DNA (for example, in an
intrachromosomal recombination), exchange regions of DNA (for example, in
an inter-chromosomal recombination) or insert DNA (for example, through
recombination between homologous sequences at a recombination site and
the DNA to be inserted). The precise event is controlled by the orientation of
the recombination site DNA sequences.

In particular embodiments of the methods for producing an acrocentric
plant chromosome provided herein, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA (in particular, proximal satellite DNA) of one plant chromosome
in the cell. In a further embodiment, nucleic acid containing complementary
recombinase recognition sites for site-specific recombination is introduced
into a cell containing one or more plant chromosomes wherein one of the
sites integrates into the distal end of an arm of a plant chromosome in the
cell. In these embodiments, recombination between the sites in the presence
of a recombinase that recognizes the sites can result in deletion of a portion
of an arm of a chromosome, reciprocal translocation between a distal portion
of a chromosome arm and a more proximal portion of another chromosome
arm or reciprocal translocation between pericentric heterochromatin and/or
satellite DNA of one chromosomal arm and a more distal portion of another
chromosome arm. Each of these recombination events can serve to reduce
the length of a chromosome arm and give rise to an acrocentric
chromosome.

In another embodiment, a nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into the pericentric heterochromatin and/or satellite
DNA of one plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of an arm of another plant
chromosome in the cell. In this embodiment, recombination between the
sites in the presence of a recombinase that recognizes the sites can result in
reciprocal translocation between the pericentric heterochromatin and/or
satellite DNA of one chromosome and the distal portion of another
chromosome arm thereby bringing these two regions into close proximity on
one chromosomal arm and reducing the amount of DNA between the
pericentric region of the arm and the end of the arm to generate an
acrocentric plant chromosome.
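
The arm-shortening effect of such a reciprocal translocation can be
pictured with a small Python sketch in which each chromosome arm is an
ordered list of named segments running from the centromere outward and the
two arms exchange the portions distal to their recombination sites. The
segment names, the site positions and the use of segment counts as a
stand-in for physical length are assumptions for illustration only.

```python
# Hypothetical sketch: reciprocal translocation between two chromosome arms.
from typing import List, Tuple


def reciprocal_translocation(
    arm_a: List[str], site_a: int, arm_b: List[str], site_b: int
) -> Tuple[List[str], List[str]]:
    """Swap the portions of each arm distal to its recombination site.

    site_a and site_b are list indices; segments from that index outward
    are exchanged between the two arms.
    """
    if not 0 <= site_a <= len(arm_a) or not 0 <= site_b <= len(arm_b):
        raise ValueError("site index outside arm")
    new_a = arm_a[:site_a] + arm_b[site_b:]
    new_b = arm_b[:site_b] + arm_a[site_a:]
    return new_a, new_b


if __name__ == "__main__":
    # Site in arm A lies just distal to the pericentric heterochromatin;
    # site in arm B lies near the distal end of the arm.
    arm_a = ["cen_A", "pericentric_het", "eu_A1", "eu_A2", "tel_A"]
    arm_b = ["cen_B", "eu_B1", "eu_B2", "eu_B3", "tel_B"]
    short_arm, long_arm = reciprocal_translocation(arm_a, 2, arm_b, 4)
    print(short_arm)
    print(long_arm)
```

In this toy layout the arm that keeps the pericentric heterochromatin ends
up with only a distal fragment beyond it, which is the shortening that the
embodiment relies on.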

These methods for producing an acrocentric plant chromosome may
also be conducted such that nucleic acid containing a site-specific
recombination site is introduced into a cell containing a plant chromosome
wherein it integrates into, or close to, the pericentric heterochromatin and/or
satellite DNA of a plant chromosome in the cell and nucleic acid containing a
complementary site-specific recombination site is introduced into the cell
wherein it integrates into the distal end of the same arm of the same
chromosome. In this embodiment, recombination between the sites in direct
(i.e., the same, or head-to-tail) orientation in the presence of a recombinase
that recognizes the sites can result in intrachromosomal recombination
between the pericentric heterochromatin (and/or satellite DNA) and the distal
portion of the chromosomal arm thereby excising DNA between these two
regions and reducing the amount of DNA between them to generate an
acrocentric plant chromosome.

In particular embodiments of the methods provided herein for
producing a plant chromosome containing adjacent regions of rDNA and
heterochromatin, such as pericentric heterochromatin and/or satellite DNA,
nucleic acid containing complementary recombinase recognition sites for site-
specific recombination is introduced into a cell containing one or more plant
chromosomes wherein one of the sites integrates into heterochromatin of
one plant chromosome in the cell. In a further embodiment, nucleic acid
containing complementary recombinase recognition sites for site-specific
recombination is introduced into a cell containing one or more plant
chromosomes wherein one of the sites integrates into rDNA or a nucleolar
organizing region (NOR) of a plant chromosome in the cell. In these
embodiments, recombination between the sites in the presence of a
recombinase that recognizes the sites can result in deletion of DNA between
a heterochromatic region, such as the pericentric heterochromatin (and/or
satellite DNA), and rDNA, inversion of DNA that includes heterochromatin or
rDNA of a plant chromosome or reciprocal translocation between
heterochromatin of one chromosomal arm and rDNA of another chromosomal
arm. Each of these recombination events can serve to arrange chromosomal
DNA such that a region of heterochromatic DNA, such as pericentric
heterochromatin and/or satellite DNA, is adjacent to a region of rDNA on a
plant chromosome.

In another embodiment, nucleic acid containing a site-specific
recombination site is introduced into a cell containing plant chromosomes
wherein it integrates into heterochromatin, such as, for example, pericentric
heterochromatin and/or satellite DNA, of one plant chromosome in the cell
and nucleic acid containing a complementary site-specific
recombination site is introduced into the cell wherein it integrates into rDNA
of another plant chromosome in the cell. In this embodiment, recombination
between the sites can result in reciprocal translocation between the
heterochromatin of one chromosome and the rDNA of another chromosome
thereby bringing these two regions into close proximity on one plant
chromosome with little to no euchromatin between them.

These methods for producing a plant chromosome containing adjacent
regions of heterochromatic DNA and rDNA may also be conducted such that
nucleic acid containing a site-specific recombination site is introduced into a
cell containing a plant chromosome wherein it integrates into
heterochromatin, for example, pericentric heterochromatin and/or satellite
DNA, of a plant chromosome and nucleic acid containing a complementary
site-specific recombination site is introduced into the cell wherein it
integrates into rDNA of the same chromosome. In this embodiment,
recombination between the sites in direct orientation in the presence of a
recombinase that recognizes the sites can result in intrachromosomal
recombination between heterochromatin, such as pericentric heterochromatin
(and/or satellite DNA), and rDNA thereby excising DNA, including
euchromatic DNA, between these two regions. Recombination of the sites in
indirect (i.e., head-to-head) orientation in the presence of a recombinase can
result in inversion of DNA between the sites thereby replacing DNA, such as
euchromatin, located between pericentric heterochromatin (and/or satellite
DNA) and rDNA on the chromosome with rDNA. Thus, in the resulting plant
chromosome, rDNA is located adjacent to pericentric heterochromatin (and/or
satellite DNA), and DNA that was present between the pericentric
heterochromatin (and/or satellite DNA) and the rDNA is located distal to the
rDNA in a position previously occupied by the rDNA.
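
The two intrachromosomal outcomes just described, excision for sites in
direct orientation and inversion for sites in indirect orientation, can be
sketched in Python by treating a chromosome arm as an ordered list of named
segments. This is a simplified picture offered for illustration: the
segment names, the index-based placement of the sites and the function name
are assumptions, and each segment is treated as an indivisible block.

```python
# Hypothetical sketch: excision versus inversion between two sites on one arm.
from typing import List, Tuple


def recombine_on_arm(
    arm: List[str], start: int, end: int, direct: bool
) -> Tuple[List[str], List[str]]:
    """Recombine two sites flanking arm[start:end].

    direct=True  (head-to-tail sites): the flanked DNA is excised.
    direct=False (head-to-head sites): the flanked DNA is inverted in place.
    Returns (resulting arm, excised segments).
    """
    if not 0 <= start < end <= len(arm):
        raise ValueError("sites must flank at least one segment")
    flanked = arm[start:end]
    if direct:
        return arm[:start] + arm[end:], flanked
    return arm[:start] + flanked[::-1] + arm[end:], []


if __name__ == "__main__":
    arm = ["cen", "pericentric_het", "euchromatin", "rDNA", "tel"]
    # Direct sites flanking the euchromatin: it is excised.
    print(recombine_on_arm(arm, 2, 3, direct=True))
    # Indirect sites flanking euchromatin plus rDNA: their order is inverted.
    print(recombine_on_arm(arm, 2, 4, direct=False))
```

In both outcomes the rDNA ends up next to the pericentric heterochromatin;
in the inversion case the euchromatin is retained but moved distal to the
rDNA, as stated in the text.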

In particular embodiments for producing an acrocentric plant
chromosome containing adjacent regions of heterochromatin, such as
pericentric heterochromatin (and/or satellite DNA), and rDNA, the short arm
of the acrocentric chromosome may be generated in the same recombination
event that places the heterochromatin and rDNA regions adjacent to each
other or in a separate recombination event. For example, nucleic acid
containing a site-specific recombination site may be introduced into a cell
containing one or more plant chromosomes wherein it integrates into the
pericentric heterochromatin of one plant chromosome and nucleic acid
containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA that is located at a
distal portion of another plant chromosome or the same arm of the same
chromosome. Recombination of the sites in the presence of a
recombinase can result in intra- or inter-chromosomal recombination that not
only brings the pericentric heterochromatin (and/or satellite DNA) and rDNA
into close proximity on one chromosomal arm, but also sufficiently reduces
the length of that arm such that the resulting chromosome is acrocentric.

If a single recombination event such as this does not generate an
acrocentric plant chromosome, multiple recombination events may be used to
produce an acrocentric plant chromosome containing adjacent regions of
heterochromatic DNA and rDNA. For example, nucleic acid containing a
site-specific recombination site may be introduced into a cell containing one or
more plant chromosomes wherein it integrates into the pericentric
heterochromatin (and/or satellite DNA) of one plant chromosome and nucleic
acid containing a complementary site-specific recombination site may be
introduced into the cell wherein it integrates into rDNA of the same or a
different plant chromosome. As described above, recombination between
the sites in the presence of a recombinase can result in deletion, inversion or
reciprocal translocation of DNA to arrange chromosomal DNA such that
pericentric heterochromatin (and/or satellite DNA) is adjacent to a region of
rDNA on a plant chromosome. In order to reduce the length of the arm of
the chromosome on which the adjacent regions of heterochromatin and rDNA
are located, an additional recombination event can be induced by introducing
nucleic acid containing a site-specific recombination site into a cell containing
this plant chromosome wherein it integrates into a region of the chromosome
distal to the rDNA and nucleic acid containing a complementary site-specific
recombination site into the cell wherein it integrates into the distal end of the
same chromosome arm or of another plant chromosome arm. Recombination
between the recognition sites can result in deletion or reciprocal translocation
of DNA to reduce the length of the chromosome arm distal to the rDNA and
give rise to an acrocentric plant chromosome containing adjacent regions of
heterochromatin and rDNA on the short arm of the chromosome.

In each of the aforementioned methods for producing a plant
chromosome that is acrocentric and/or contains adjacent regions of
heterochromatin and rDNA, the nucleic acid containing the two or more
recombination sites may be introduced simultaneously or sequentially into a
cell or cells using nucleic acid transfer methods described herein or known in
the art. The nucleic acids may randomly integrate into plant chromosomes or
may be targeted for integration into a particular region or site on a plant
chromosome through homologous recombination between sequences in the
nucleic acid and sequences within the chromosome. The recombinase
activity may be provided by introduction of nucleic acid encoding an
appropriate recombinase into the cell for expression therein. The
recombinase-encoding nucleic acid may be introduced into the cell prior to,
during or after introduction of nucleic acids containing recombination sites.

To facilitate identification of cells containing the transferred nucleic
acids and/or in which a recombination event has occurred, nucleic acid
encoding a selectable marker may be introduced into the cell. For example,
one or both of the nucleic acids containing a recombination site may also
contain DNA encoding a selectable marker (e.g., a resistance-encoding
marker or a reporter molecule) operatively linked to a promoter which is
oriented such that integration of the nucleic acid into a chromosome places
the marker DNA between two directly oriented recombination sites on an arm
of a chromosome. A cell containing the nucleic acid will thus be resistant to
a selection agent or will detectably express a reporter molecule. Exposure of
the cell to the appropriate recombinase can result in a recombination event
that excises the DNA between the two recombination sites, which includes
DNA encoding the selectable marker. Thus, recombination could be detected
as loss of reporter molecule expression or decreased resistance to a selection
agent. After exposure to a recombinase, the cells into which nucleic
acids containing recombination sites have been transferred may be analyzed
for the presence of acrocentric plant chromosomes using, for example, FISH
analysis and other chromosome visualization techniques.

In another method provided herein for producing a plant chromosome
that is acrocentric and/or contains adjacent regions of heterochromatin and
rDNA, the recombination event or events that lead to formation of the
chromosome occur through crossing of transgenic plants that contain
chromosomes which contain complementary site-specific recombination
sites. Thus, in one embodiment of these methods, nucleic acid containing a
recombination site adjacent to nucleic acid encoding a selectable marker is
introduced into a first plant cell and a first transgenic plant is generated from
the first plant cell. Nucleic acid containing a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative
linkage is introduced into a second plant cell from which a second transgenic
plant is generated. The first and second transgenic plants are crossed to
obtain one or more plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker, and a resistant
plant that contains cells comprising a plant chromosome that is acrocentric
and/or contains adjacent regions of heterochromatin and rDNA is selected.

In an example of this method, nucleic acids containing site-specific
recombination sites are introduced into cells of Nicotiana tabacum. The
nucleic acids are introduced separately by infecting leaf explants with
Agrobacterium tumefaciens which carries the kanamycin-resistance gene
(KanR). Kanamycin-resistant transgenic plants are generated from the
infected leaf explants. One transgenic plant contains nucleic acid encoding a
promoterless hygromycin-resistance gene preceded by a lox site-specific
recombination sequence (lox-hpt); the other plant contains a cauliflower
mosaic virus 35S promoter linked to a lox sequence and the cre DNA
recombinase coding region (35S-lox-cre). The resultant KanR transgenic
plants are crossed (see, e.g., protocols of Qin et al. (1994) Proc. Natl. Acad.
Sci. U.S.A. 91:1706-1710). Plants in which the appropriate DNA
recombination event has occurred are identified by hygromycin-resistance.
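
The logic of this selection scheme can be sketched in a few lines of Python.
The sketch below is a simplified illustration, not the protocol of the cited
reference: each chromosome is modeled as a list of elements ordered from
centromere to telomere, the function names and the chromosome contents (CEN1,
het, rDNA and so on) are hypothetical, and both lox sites are assumed to lie
in the same orientation relative to their centromeres. Recombination between
the two lox sites exchanges the distal segments, which places the 35S
promoter upstream of the previously promoterless hpt gene.

```python
from typing import List, Tuple

Chromosome = List[str]  # elements listed from centromere to telomere


def reciprocal_translocation(
    a: Chromosome, b: Chromosome, site: str = "lox"
) -> Tuple[Chromosome, Chromosome]:
    """Exchange the segments distal to the single site on each chromosome.

    Assumes both sites share the same orientation relative to the centromere.
    """
    i, j = a.index(site), b.index(site)
    return a[: i + 1] + b[j + 1:], b[: j + 1] + a[i + 1:]


def hygromycin_resistant(chromosomes: List[Chromosome]) -> bool:
    """True if any chromosome carries 35S, lox and hpt as consecutive elements."""
    target = ["35S", "lox", "hpt"]
    return any(
        chrom[k: k + 3] == target
        for chrom in chromosomes
        for k in range(len(chrom) - 2)
    )


# Hypothetical parental chromosomes brought together by the cross.
parent_hpt = ["CEN1", "het", "lox", "hpt", "TEL1"]          # promoterless lox-hpt
parent_cre = ["CEN2", "rDNA", "35S", "lox", "cre", "TEL2"]  # 35S-lox-cre

before = hygromycin_resistant([parent_hpt, parent_cre])
der_cre, der_hpt = reciprocal_translocation(parent_cre, parent_hpt)
after = hygromycin_resistant([der_cre, der_hpt])
print(before, after)
```

In this model neither parental chromosome confers hygromycin resistance, and
resistance appears only after the exchange, which is why resistance can serve
as a readout of the recombination event.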

The KanR cultivars initially may be screened, such as by FISH, to
identify two sets of candidate transgenic plants. One set has one construct
integrated in regions adjacent to the pericentric heterochromatin (and/or
satellite DNA) on the short arm of any chromosome. The second set of
candidate plants has the other construct integrated in rDNA, such as the
NOR region, of appropriate chromosomes. To obtain reciprocal translocation,
both sites must be in the same orientation. Therefore a series of crosses
may be required, marker-resistant plants generated, and FISH analyses
performed to identify an "acrocentric" plant chromosome or chromosomes
that contain adjacent regions of heterochromatin. As described above, such
an acrocentric chromosome may be used for de novo plant artificial
chromosome formation, particularly predominantly heterochromatic plant
artificial chromosomes. The selection of appropriate plant lines can be done,
for example, using marker-assisted selection.

F. Incorporation of Heterologous Nucleic Acids into Artificial 
Chromosomes 

Heterologous nucleic acids can be introduced into artificial
chromosomes during or after formation. Incorporation of particular desired
nucleic acids into an artificial chromosome during generation thereof may be
accomplished by including the desired nucleic acids along with the nucleic
acid encoding a selectable marker and any other nucleic acids used in
artificial chromosome generation (e.g., targeting sequences that direct the
heterologous nucleic acid to the pericentric region of a chromosome) in the
transformation of a cell to initiate amplification and formation of artificial
chromosomes.

Alternatively, heterologous nucleic acids may be incorporated into an
artificial chromosome following formation thereof through transfection of a
cell containing the artificial chromosome with the heterologous nucleic acids.
In general, incorporation of such nucleic acids into the artificial chromosome
is assured through site-directed integration, such as may be accomplished by
including nucleic acids homologous or identical to DNA contained within the
artificial chromosome with the heterologous nucleic acid when transferring
it to the artificial chromosome. An additional selective marker gene may also
be included.

Additionally, introduction of nucleic acids, particularly DNA molecules,
to an artificial chromosome can be accomplished by the use of site-specific
recombinases as described herein (see, also, copending U.S. provisional
application Serial No. 60/294,758 by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2001, U.S. provisional application
Serial No. 60/366,891, by Perkins et al. entitled "CHROMOSOME-BASED
PLATFORMS" filed on March 21, 2002, U.S. patent application Serial No.
, by Perkins et al. entitled "CHROMOSOME-BASED PLATFORMS" filed
on May 30, 2002, under attorney docket no. 24601-420, and PCT
International Application No. , by Perkins et al. entitled
"CHROMOSOME-BASED PLATFORMS" filed on May 30, 2002, under
attorney docket no. 24601-420PC; each of which is incorporated in its
entirety by reference thereto). Artificial chromosomes can be produced
containing recombinase recognition sequences, to allow the site-specific
introduction of DNA molecules into the same. Another use for an introduced
recombinase site is to provide a region for site-specific integration of a new
trait by the use of recombinase mediated gene insertion.

G. Introduction of Artificial Chromosomes into Plant Cells and Recovery
of Plants Containing Artificial Chromosomes

Artificial chromosomes can be introduced into plant cells by a variety
of methods familiar to those skilled in the art. These methods include
chemical and physical methods for introduction of foreign DNA, as well as
cell culture methods to transfer chromosomes from one cell to another cell.

Any type of artificial chromosome can be used. Plant artificial
chromosomes (PACs) can be prepared by the in vivo and in vitro methods
described herein. PACs can be prepared inside plant protoplasts and then
transferred to other plant species and tissues, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 296:72-74). PACs can be isolated from the protoplasts in which they
were prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs can be isolated and delivered directly to plant protoplasts, plant
cells, or other plant targets via a PEG-mediated process, calcium
phosphate-mediated process, electroporation, microinjection, particle bombardment,
lipid-mediated method with or without sonoporation, sonoporation alone, or
any method known in the art as described herein (Haim et al. (1985) Mol.
Gen. Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm
et al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358). Plant
artificial chromosomes can also be transferred to other plant species by
preparation of protoplast-derived plant microcells, and fusion of the
microcells containing the plant artificial chromosome with plant cells of other
plant species.

Mammalian artificial chromosomes (MACs) can be transferred to plant
cells. Mammalian artificial chromosomes are prepared by the in vivo and in
vitro methods described in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application No. WO 97/40183. MACs can be prepared as
microcells, and the microcells can be fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149). Alternatively, the MACs
can be isolated and delivered directly to plant cells, protoplasts, and other
plant targets using a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, lipid-mediated method with or
without sonoporation, sonoporation alone, or any method known in the art as
described herein and in US Patent Nos. 6,025,155 and 6,077,697, and
International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets can be developed using standard conditions into roots, shoots,
plantlets, or any structure capable of growing into a plant.

Accordingly, methods for the introduction of artificial chromosomes 
represent the first step in the production of plant cells and whole plants 
containing artificial chromosomes from a variety of sources. 

The ability to introduce genes into plants, such that they are stably
expressed and transmissible from generation to generation, has
revolutionized plant biology and opens up new possibilities for using plants
as green factories for the production of commercially useful products as well
as for other applications described herein. There are several approaches to
the generation of stably transformed plants, and the adopted approach varies
according to the aims of the project. For introduction of artificial
chromosomes into plants, a variety of methods may be employed. For
transgenic plants, the transformation process involves the methods of foreign
DNA delivery to plant host cells, the growth and analysis of transformed
plant host cells, and the generation and regeneration of transgenic plants
from transformed plant host cells.

1. Introduction of artificial chromosomes into plant host cells
Numerous methods for producing or developing transgenic plants are
available to those of skill in the art. The method used is primarily a function
of the species of plant. Artificial chromosomes containing heterologous
DNA, such as artificial chromosomes prepared by the methods described
herein, can be introduced into plant host cells, including, but not limited to,
plant cells and protoplasts, by, for example, non-vector mediated DNA
transfer processes (see, also, copending U.S. application Serial No.
09/815,979, which describes methods for delivery that can be adapted for
use with plant cells and used with plant protoplasts).

Non-vector mediated, or direct, gene transfer systems involve the
introduction of heterologous DNA, in particular artificial chromosomes, into
host cells, including but not limited to plant cells and protoplasts, without the
use of a biological vector. The artificial chromosome that is introduced into
these plant host cells can lead to the development of transformed,
regenerable transgenic plants. The direct gene transfer systems for
transgenic plants are designed to overcome the barrier to DNA uptake
caused by the cell wall and the plasma membrane of plant cells. The
approaches for direct gene transfer include, but are not limited to, chemical,
electrical, and physical methods, which can also be adapted to optimize
transfer of artificial chromosomes (see, e.g., Uchimiya et al. (1989) J. of
Biotech. 12:1-20 for a review of such procedures; see also, e.g., U.S.
Patent Nos. 5,436,392; 5,489,520; Potrykus et al. (1985) Mol. Gen. Genet.
199:183; Lorz et al. (1985) Mol. Gen. Genet. 199:178; Fromm et al. (1985)
Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Uchimiya et al. (1986) Mol.
Gen. Genet. 204:204; Callis et al. (1987) Genes Dev. 1:1183-1200; Callis et
al. (1987) Nuc. Acids Res. 15:5823-5831; Marcotte et al. (1988) Nature
335:454 and Toriyama et al. (1988) Bio/Technology 6:1072-1074).
a. Chemical methods
Uptake of artificial chromosomes into plant cells, such as protoplasts,
can be accomplished in the absence or presence of polyethylene glycol
(PEG), which is a fusogen, or by any variations of such methods known to
those of skill in the art [see, e.g., U.S. Patent No. 4,684,611 to Schilperoot
et al.; Paskowski et al. (1984) EMBO J. 3:2717-2722; U.S. Patent Nos.
5,231,019 and 5,453,367]. In one approach, plant protoplasts are
incubated with a solution of foreign DNA, in particular artificial
chromosomes, and PEG at a concentration that allows for high cell survival
and high efficiency chromosome uptake. The protoplasts are then washed
and cultured [Datta and Datta (1999) Meth. in Molecular Biol. 111:335-348].
In an alternative approach, plant protoplasts are incubated with artificial
chromosomes in the presence of calcium phosphate for direct artificial
chromosome uptake (Haim et al. (1985) Mol. Gen. Genet. 199:161-168).
Alternatively, the artificial chromosome, in particular plant artificial
chromosome (PAC), is formed in a plant protoplast which is, in turn, fused
with another plant protoplast in the presence or absence of PEG to transfer
the PAC to the plant host protoplast. Such methods for treating protoplasts
with PEG and foreign DNA are well known in the art (Draper et al. (1982)
Plant Cell Physiol. 23:451-458; Krens et al. (1982) Nature 296:72-74).

Another chemical direct gene transfer method involves lipid-mediated
delivery of artificial chromosomes to plant protoplasts. In this process,
liposomes with encapsulated artificial chromosomes are allowed to fuse with
protoplasts alone or in the presence of PEG as the fusogen to transfer the
foreign DNA, in particular artificial chromosome, to the plant host protoplast
(Deshayes et al. (1985) EMBO J. 4:2731-2737; Fraley and Paphadjopoulos
(1982) Curr Top Microbiol Immunol 96:171-191).

Another direct gene transfer method involves the use of microcells.
The chromosomes can be transferred by preparing microcells containing
artificial chromosomes and then fusing the microcells with plant protoplasts.
Methods for the preparation and fusion of microcells with other cells are well
known in the art (see Example No. 4 and see also, e.g., U.S. Patent Nos.
5,240,840; 4,806,476; 5,298,429; 5,396,767; Fournier (1981) Proc. Natl.
Acad. Sci. U.S.A. 78:6349-6353; Lambert et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:5907-59; Dudits et al. (1976) Hereditas 82:121-123;
and Wiegland et al. (1987) J. Cell. Sci. Pt. 2 145-149).
b. Electrical methods 
Electroporation, which involves high-voltage electrical pulses to a solution
containing a mixture of protoplasts or plant cells and foreign DNA, in
particular artificial chromosomes, to create nanometer-sized, reversible pores,
is a common method to introduce DNA into plant cells or protoplasts. The
exogenous DNA may be added to the protoplasts in any form such as, for
example, naked linear, circular or supercoiled DNA, artificial chromosomes
encapsulated in liposomes, DNA in spheroplasts, artificial chromosomes in
other plant protoplasts, artificial chromosomes complexed with salts, and
other methods. The foreign DNA, in particular artificial chromosome, can also
include a phenotypic marker to identify plant cells that are successfully
transformed.

When plant cells or protoplasts are subjected to short electrical DC (direct
current) pulses, they may experience an increase in the permeability of the
plasma membrane and/or cell wall to hydrophilic molecules such as nucleic
acids, which are normally unable to enter the plant cell directly. Nucleic
acids are taken directly into the cell cytoplasm either through these pores or
as a consequence of the redistribution of membrane components that
accompanies closure of the pores. Certain cell wall-degrading enzymes, such
as pectin-degrading enzymes, may be employed to render the plant target
recipient cells more susceptible to DNA or artificial chromosome uptake by
electroporation than untreated cells. Plant recipient cells may also be
susceptible to transformation by mechanical wounding. To effect
transformation by electroporation, friable tissues such as a suspension
culture of cells or embryonic callus may be used or immature embryos or
other organized tissues may be directly transformed (see, e.g., Fromm et al.
(1986) Nature 319:791-793). Methods for effecting electroporation are well
known in the art (see, e.g., U.S. Patent Nos. 4,784,737; 4,970,154;
5,304,486; 5,501,967; 5,501,662; 5,019,034; 5,503,999; see, also, Fromm
et al. (1985) Proc. Natl. Acad. Sci. U.S.A. 82:5824-5828; Zimmerman et al.
(1981) Biophys Biochem Acta 641:160-165; Neuman et al. (1982) EMBO J.
1:841-845; Riggs et al. (1986) Proc. Nat. Acad. Sci. USA 83:5602-5606;
Lurquin (1997) Mol. Biotechnol. 7:5-35; Bates (1999) Methods in Molecular
Biology 111:359-366). Electroporation can be used to introduce nucleic
acids into tobacco mesophyll cells (Morikawa et al. (1986) Gene
41:121-124); leaf bases of rice (Dekeyser et al. (1990) Plant Cell 2:591-602);
immature maize embryos (Songstad et al. (1993) Plant Cell Tiss. Orgn. Cult.
40:1-15); macerated immature maize embryos (D'Halluin et al. (1992) Plant
Cell 4:1495-1505); suspension cultured maize cells (Laursen et al. (1994)
Plant Mol. Biol. 24:51-61); and sugar cane (Arencibia et al. (1995) Plant Cell
Rep. 14:305-309).

Artificial chromosomes may be delivered to plant cells, in particular
plant seeds, by the use of electroporation and pollen to derive pollen
comprising an artificial chromosome. Methods that may be used for delivery
of artificial chromosomes into pollen include, for example, techniques
described in U.S. Patent No. 5,049,500 and by Negrutiu et al. [in
Biotechnology and Ecology of Pollen, Mulcahy et al., eds., (1986) Springer
Verlag, N.Y., pp. 65-69] and Fromm et al. [(1986) Nature 319:791; including
methods for introducing DNA into mature pollen using various procedures
such as heat shock, PEG and electroporation]. The pollen is capable of
germinating and fertilizing an egg cell, leading to the formation of a plant
seed comprising an artificial chromosome.



c. Physical methods

The physical methods approach for introducing foreign DNA, in
particular artificial chromosomes, into plant cells overcomes the cell wall
barrier to DNA movement. Physical, or mechanical, means are used to
introduce transgenes directly into protoplasts or plant cells and include, but
are not limited to, microinjection, particle bombardment, and sonoporation.

(1) Microinjection

Microinjection involves the mechanical injection of heterologous DNA,
in particular artificial chromosomes, into plant cells, including cultured cells
and cells in intact plant organs and embryoids in tissue culture via very small
micropipettes, needles, or syringes (Neuhaus et al. (1987) Theor. Appl. Genet.
75:30-36; Reich et al. (1986) Can. J. Bot. 64:1255-1258; Crossway et al.
(1986) BioTechniques 4:320-334; Crossway et al. (1986) Mol. Gen. Genet.
202:179; U.S. Patent No. 4,743,548; see also silicon carbide whiskers,
Kaeppler et al. (1990) Plant Cell Rep. 9:415-418; Frame et al. (1994)). For example,
microinjection of protoplast cells with foreign DNA for transformation of plant
cells has been reported for barley and tobacco (see, e.g., Holm et al. (2000)
Transgenic Res. 9:21-32 and Schnorf et al. Transgenic Res. 1:23-30). Single
artificial chromosomes may be front-loaded into microinjection needles and
then injected into cells ("pick-and-inject") following procedures as described
by Co et al. [(2000) Chromosome Res. 8:183-191].

(2) Particle bombardment 

Microprojectile bombardment (acceleration of small high density
particles, which contain the DNA, to high velocity with a particle gun
apparatus, which forces the particles to penetrate plant cell walls and
membranes) has also been used to introduce heterologous DNA into plant
cells. Microprojectile bombardment techniques for the introduction of nucleic
acids into plant cells, in addition to being an effective means of reproducibly
stably transforming plant cells, particularly monocots, do not require isolation
of protoplasts or susceptibility of the host cell to Agrobacterium infection. In
these methods, nucleic acids are carried through the cell wall and into the
cytoplasm on the surface of small, typically metal, particles (see, e.g., Klein
et al. (1987) Nature 327:70; Klein et al. (1988) Proc. Natl. Acad. Sci. U.S.A.
85:8502-8505; Klein et al. in Progress in Plant Cellular and Molecular
Biology, eds. Nijkamp, H.J.J., Van der Plas, J.H.W., and Van Aartrijk, J.,
Kluwer Academic Publishers, Dordrecht (1988), p. 56-66; McCabe et al.
(1988) Bio/Technology 6:923-926; Sautter et al. (1991) Biol. Technol.
9:1080-1085; Gordon-Kamm et al. (1990) Plant Cell 2:603-618; Finer et al.
(1999) Curr. Top. Microbiol. Immunol. 240:59-80; Vasil and Vasil (1999)
Methods in Molecular Biology 111:349-358; Seki et al. (1999) Mol.
Biotechnol. 11:251-255). Particles may be coated with nucleic acids and
delivered into cells by a propelling force. Exemplary particles include those
containing tungsten, gold or platinum, as well as magnesium sulfate crystals.
The metal particles can penetrate through several layers of cells and thus
allow the transformation of cells within tissue explants.

In an illustrative embodiment (see, e.g., U.S. Patent No. 6,023,013) of
a method for delivering foreign nucleic acids into plant cells, e.g., maize
cells, by acceleration, a Biolistics Particle Delivery System may be used to
propel particles coated with DNA or cells through a screen, such as a
stainless steel or Nytex screen, onto a filter surface covered with plant (e.g.,
corn) cells cultured in suspension. The screen disperses the particles so that
they are not delivered to the recipient cells in large aggregates. The
intervening screen between the projectile apparatus and the cells to be
bombarded may reduce the size of projectile aggregates and may contribute
to a higher frequency of transformation by reducing damage inflicted on the
recipient cells by projectiles that are too large.

For the bombardment, cells in suspension may be concentrated on
filters or solid culture medium. Alternatively, immature embryos or other
plant target cells may be arranged on solid culture medium. The cells to be
bombarded are typically positioned at an appropriate distance below the
microprojectile stopping plate. If desired, one or more screens may also be
positioned between the acceleration device and the cells to be bombarded.

The prebombardment culturing conditions and bombardment
parameters may be optimized to yield the maximum numbers of stable
transformants. Both the physical and biological parameters for bombardment
are important in this technology. Physical factors include those that involve
manipulating the DNA/microprojectile precipitate or those that affect the
flight and velocity of either the macro- or microprojectiles. Biological factors
include all steps involved in manipulation of cells before and immediately
after bombardment, the osmotic adjustment of target cells to help alleviate
the trauma associated with bombardment, and also the nature of the
transforming nucleic acid, such as linearized DNA, intact supercoiled
plasmids, or artificial chromosomes.

Physical parameters that may be adjusted include gap distance, flight
distance, tissue distance and helium pressure. In addition, transformation
may be optimized by adjusting the osmotic state, tissue hydration and
subculture stage or cell cycle of the recipient cells. Ballistic particle
acceleration devices are available from Agracetus, Inc. (Madison, WI) and
BioRad (Hercules, CA).
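
One way to organize an optimization over these four physical parameters is to
enumerate every combination of candidate settings and test each in turn. The
Python sketch below is a bookkeeping illustration only. The class and function
names, the units and all numeric values are hypothetical placeholders chosen
for the example; they are not recommended settings and are not taken from the
references cited herein.

```python
from dataclasses import dataclass
from itertools import product
from typing import Iterator, Sequence


@dataclass(frozen=True)
class BombardmentSetting:
    """One combination of the four adjustable physical parameters."""
    gap_distance_mm: float       # assumed unit
    flight_distance_mm: float    # assumed unit
    tissue_distance_cm: float    # assumed unit
    helium_pressure_psi: float   # assumed unit


def parameter_grid(
    gaps: Sequence[float],
    flights: Sequence[float],
    tissues: Sequence[float],
    pressures: Sequence[float],
) -> Iterator[BombardmentSetting]:
    """Yield every combination of the supplied candidate values."""
    for combo in product(gaps, flights, tissues, pressures):
        yield BombardmentSetting(*combo)


# Placeholder values for illustration; real ranges depend on the device,
# the tissue and the transforming nucleic acid.
settings = list(
    parameter_grid(
        gaps=[3.0, 6.0],
        flights=[5.0, 10.0],
        tissues=[6.0, 9.0, 12.0],
        pressures=[650.0, 1100.0, 1550.0],
    )
)
print(len(settings))
```

Each setting would then be paired with a measured outcome, such as the number
of stable transformants recovered, so that the best combination can be chosen
for a given species or cultivar.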

Techniques for transformation of an A188-derived maize line using
particle bombardment are described in Gordon-Kamm et al. (1990) Plant Cell
2:603-618 and Fromm et al. (1990) Biotechnology 8:833-839.
Transformation of rice may also be accomplished via particle bombardment
(see, e.g., Christou et al. (1991) Biotechnology 9:957-962). Particle
bombardment may also be used to transform wheat (see, e.g., Vasil et al.
(1992) Biotechnology 10:667-674 for transformation of cells of type C
long-term regenerable callus; and Weeks et al. (1993) Plant Physiol.
102:1077-1084 for transformation of wheat using particle bombardment of immature
embryos and immature embryo-derived callus). The production of transgenic
barley using bombardment methods is described, for example, by Koprek et
al. (1996) Plant Sci. 119:79-91.

(3) Sonoporation

Foreign DNA, in particular artificial chromosomes, may be introduced
into plant protoplasts using ultrasound treatment, in particular mild
ultrasound treatment (10-100 kHz), to create pores for DNA uptake (see, e.g.,
International PCT application publication no. WO 91/00358) or may be
introduced into plant protoplasts via a sonoporation machine (ImaRx
Pharmaceutical Corp., Tucson, AZ).

Alternatively, the delivery of artificial chromosomes into plant host
cells is performed by any method described herein or well known in the art.
For example, needle-like whiskers (US 5,302,523, 1994, US 5,464,765)
have been used to deliver foreign DNA.

Suitable plant targets into which foreign DNA, in particular artificial
chromosomes, is transferred include, but are not limited to, protoplasts, cell
culture cells, cells in plant tissue, meristem cells, microspores, callus, pollen,
pollen tubes, egg-cells, embryo-sacs, zygotes or embryos in
different stages of development, seeds, seedlings, roots, stems, leaves,
whole plants, algae, or any plant part capable of proliferation and
regeneration of plants (see, e.g., U.S. Patent Nos. 5,990,390 and
6,037,526). The growth of the transformed plant targets described
herein can be done with tissue-culture or non-tissue culture methods, with the
preferred methods being tissue culture methods.

All plant cells into which foreign DNA, in particular artificial
chromosomes, is introduced, and all material regenerated from the transformed
cells, are used directly for expressed purposes (e.g., herbicide resistance,
insect/pest resistance, disease resistance, environmental/stress resistance,
nutrient utilization, male sterility, improved nutritional content, production of
chemicals or biologicals, non-protein expressing sequences, and preparation
and screening of libraries) as described herein or are used to produce
transformed whole plants for the applications and uses described herein. The
particular protocol and means for the introduction of the artificial
chromosome into the plant host is adapted or refined to suit the particular
plant species or cultivar.

Chromosomes may be transferred to cells by microcell mediated
chromosome transfer (MMCT) (Telenius et al., Chromosome Research 7:3-7,
1999; Ramulu et al., Methods in Molecular Biology 111:227-242, 1999). In
general, donor plant cultures or donor mammalian cell cultures are incubated
in media supplemented with reagents that inhibit DNA synthesis (e.g.,
hydroxyurea, aphidicolin) and/or reagents that inhibit attachment of
chromosomes to the mitotic spindle (e.g., colcemid, colchicine, amiprophos-methyl,
cremart). The cell walls of plant cells are digested with enzymes
(e.g., cellulase, macerozyme), producing protoplasts. Donor plant
protoplasts or donor mammalian cells are loaded on a Percoll gradient in the
presence of cytochalasin-B (which causes the cell cytoskeleton to
depolymerize into monomer protein subunits) and centrifuged at 10^5 x g.
During centrifugation the metaphase chromosomes are extruded through the
plasma membrane, forming plant 'microprotoplasts' or mammalian
'microcells.' The microprotoplasts/microcells are filtered through nylon
sieves of decreasing pore size (8-3 µm) to isolate smaller ones that contain
predominantly one metaphase chromosome. The microprotoplasts/microcells
are fused to recipient plant protoplasts or mammalian cells by polyethylene
glycol (PEG) treatment. The fusion mixture is cultured in appropriate media.
If the chromosome of interest expresses a selection marker gene, the
fusion mixture may be cultured in appropriate media supplemented with the
appropriate selection drug (e.g., hygromycin, kanamycin).
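
By way of illustration only, the size-selection step of the MMCT procedure can be sketched in Python. The particle diameters and the intermediate 5 µm sieve below are hypothetical values chosen for the example; only the 8 µm and 3 µm limits are taken from the description above. The function name sieve_series is likewise an assumption of this sketch.

```python
from typing import Iterable, List


def sieve_series(diameters_um: Iterable[float],
                 pore_sizes_um: Iterable[float]) -> List[float]:
    """Return the particle diameters that pass every sieve in the series.

    Sieves are applied from the largest pore size to the smallest,
    mirroring filtration through sieves of decreasing pore size.
    """
    passed = list(diameters_um)
    for pore in sorted(pore_sizes_um, reverse=True):
        # A particle passes a sieve only if it is smaller than the pore.
        passed = [d for d in passed if d < pore]
    return passed


# Hypothetical microprotoplast/microcell diameters, in micrometres.
sample = [12.0, 7.5, 4.2, 2.8, 2.1, 9.3]

# 8 and 3 um come from the text; the 5 um step is an assumed intermediate.
small_fraction = sieve_series(sample, pore_sizes_um=[8.0, 5.0, 3.0])
print(small_fraction)
```

The fraction that passes the smallest sieve corresponds to the smaller microprotoplasts/microcells that, according to the description, contain predominantly one metaphase chromosome.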

2. The growth of transformed plant host cells

In tissue culture methods, plant cells or protoplasts transformed by the
chemical, physical, or electrical methods described herein are grown, or
cultured, under selective conditions. The selectable markers are integrated
into the heterologous DNA, in particular the artificial chromosome, before its
introduction into plant hosts or are integrated into the plant host after
transfection. An additional marker can be used for double selection.
Generally, the plant cells or protoplasts are grown for numerous generations,
after which the transformed cells are identified.

The transformed cells are subjected to conditions known in the art for
callus initiation. Tissue that develops during the initiation period is placed in
a regeneration or selection medium where shoot and root development occur.
The plantlets are analyzed to determine whether they are transformed
(International PCT application publication no. WO 00/60061). In the case of
maize, embryogenic callus cultures are initiated from immature maize embryos,
bombarded with genes, and regenerated into plantlets by the methods
described in International PCT application publication no. WO 00/60061. In
tissue culture methods, rice calli are transformed with DNA encoding
insecticidal proteins CryIA(b) and CryIA(c) for insect resistance. Common
tissue culture methods can also be used to transform tobacco and tomato
(see, e.g., US Patent No. 6,136,320), embryogenic maize calli (US Pat. Nos.
5,508,468; 5,538,877; 5,538,880; 5,780,708; 6,013,863; 5,554,798;
5,990,390; and 5,484,956) and other crop species, e.g., potato and
tobacco (Sijmons et al. (1990) Bio/Technol. 8:217-221); tobacco
(Vandekerckhove et al. (1989) Bio/Technol. 7:929-932 and Owen and Pen,
eds., Transgenic Plants: A Production System for Industrial and
Pharmaceutical Proteins, John Wiley & Sons, Chichester, 1996); and rice
(Zhu et al. (1994) Plant Cell Tiss. Org. Cult. 36:197-204).

3. Analysis of transformed plant host cells

Once foreign DNA, in particular artificial chromosomes, is introduced
into plant hosts and the cells or protoplasts are grown and developed under
the conditions described herein, the plant cells or protoplasts that were
transformed with artificial chromosomes are identified. The plant cell,
protoplast, callus, leaf disc, or other plant target is screened for the
presence of artificial chromosomes by various methods well known in the art
including, but not limited to, assays for the expression of reporter genes,
PCR of the isolated plant chromosomes or DNA, electron microscopy,
visualization methods, and in situ hybridization with chromosome painting
probes as described herein. Moreover, cells treated with artificial
chromosomes are isolated during metaphase using a mitotic arrest agent,
such as colchicine, and the artificial chromosomes are distinguished from
endogenous chromosomes by fluorescence-activated cell sorting, size and
density differences, or by any method well known in the art. Alternatively,
when a selectable marker gene is transmitted with or as part of the artificial
chromosome, selective agents are used to detect the expression of the
selectable marker (International PCT application publication no. WO
00/60061; US Patent No. 6,136,320; Owen and Pen, eds., Transgenic Plants:
A Production System for Industrial and Pharmaceutical Proteins). Enzymatic
assays, immunological assays, bioassays, germination assays, or chemical
assays are used to assess the phenotypic effects of artificial chromosomes,
such as insect or fungal resistance or any other expression of genes in
artificial chromosomes (Cheng et al. (1998) Proc. Natl. Acad. Sci. U.S.A.
95:2767-2772; US Patent No. 6,126,320; International PCT application
publication no. WO 00/60061; Owen and Pen, eds., Transgenic Plants: A
Production System for Industrial and Pharmaceutical Proteins, John Wiley &
Sons, Chichester, 1996). The
plant cells, protoplasts, or other plant hosts that are successfully transformed
with artificial chromosomes are used directly to express the gene of interest
or are used to generate transgenic plants.

Fluorescent in situ hybridization (FISH) may be used to screen for the
transfer of artificial chromosomes into plant cells, using DNA probes specific
for the artificial chromosome (e.g., a mouse major satellite DNA probe for
murine satellite DNA based artificial chromosomes; or a kanamycin,
hygromycin or GUS gene DNA probe for a plant artificial chromosome
carrying such a gene). Standard FISH techniques for plant cells have been
described (de Jong et al., Trends in Plant Science 4:258-263, 1999).

IdU labeling can be used to determine the optimum conditions for
transfer of chromosomes in microcells or of isolated artificial chromosomes.
The incorporated IdU increases the fragility of the chromosome and will
increase the probability of cellular mutation. Hence, the cells are fixed
within 48 hours after transfection/fusion and analyzed for chromosome uptake
using various procedures. Once the optimum transfer conditions have been
determined, long-term expression experiments are performed with unlabeled
artificial chromosomes or microcells.

H. Re-generation of transgenic plants

Plants containing artificial chromosomes are generated from plant
cells, protoplasts, calli, or other plant tissue targets into which foreign DNA,
in particular artificial chromosomes, has been introduced. Regeneration
techniques for many commercially important plant species are well known in
the art. The artificial chromosomes that are inserted into plant hosts to
produce transgenic plants are PACs or MACs.

Plants are re-generated by the planting of transformed roots, plantlets,
seeds, seedlings and structures capable of growing into a whole plant
capable of reproduction (see, e.g., US Patent No. 6,136,320 and
International PCT application No. WO 00/60061). The re-generation of maize
plants from transformed protoplasts is found, for example, in European
Patent Application nos. 0 292 435 and 0 392 225 and International PCT
Application Publication no. WO 93/07278; the regeneration of rice following
gene transfer is found in Zhang et al. (1988) Plant Cell Rep. 7:379-384;
Shimamoto et al. (1989) Nature 338:274-277; Datta et al. (1990)
Bio/Technology 8:736-740; and the re-generation of fertile transgenic barley
by direct DNA transfer to protoplasts is described by Funatsuki et al. (1995)
Theor. Appl. Genet. 91:707-712. Alternatively, plants containing artificial
chromosomes are obtained by crossing a plant containing an artificial
chromosome with another plant to produce plants having an artificial
chromosome in their genomes (see, e.g., US Patent No. 6,150,585).

Plants containing an artificial chromosome are propagated through
seed, cuttings, or vegetatively. The seed from plants containing an artificial
chromosome is grown in the field, in pots, indoors, outdoors, in
greenhouses, on glass, or in or on any suitable medium, and the resulting
sexually mature transgenic plants are self-pollinated to generate true breeding
plants. The progeny from these transgenic plants become true breeding lines
(International PCT application publication Nos. WO 00/60061 and EP
1017268; US Patent Nos. 5,631,152; 5,955,362; 6,015,940; 6,013,523;
6,096,546; 6,037,527; 6,153,812; Weissbach and Weissbach (1988)
Methods for Plant Molecular Biology, Academic Press, Inc.; Fromm et al.
(1990) Bio/Technology 8:833-839; Gordon-Kamm et al. (1990) Plant Cell
2:603-608; Koziel et al. (1993) Bio/Technology 11:194-200; and Golovkin et
al. (1993) Plant Sci. 90:41-52).

1. PACs

Plant artificial chromosomes (PACs) are prepared by the in vivo and in
vitro methods described herein. PACs may be prepared inside plant
protoplasts and then transferred to plant targets, in particular to other plant
protoplasts, via fusion in the presence or absence of PEG as described herein
(Draper et al. (1982) Plant Cell Physiol. 23:451-458; Krens et al. (1982)
Nature 296:72-74). PACs are isolated from the protoplasts in which they were
prepared, encapsulated into liposomes, and delivered to other plant
protoplasts (Deshayes et al. (1985) EMBO J. 4:2731-2737). Alternatively,
the PACs are isolated and delivered directly to plant protoplasts, plant cells,
or other plant targets via a PEG-mediated process, calcium phosphate-mediated
process, electroporation, microinjection, sonoporation, or any
method known in the art as described herein (Hain et al. (1985) Mol. Gen.
Genet. 199:161-168; Fromm et al. (1986) Nature 319:791-793; Fromm et
al. (1985) Proc. Nat. Acad. Sci. USA 82:5824-5828; Klein et al. (1987)
Nature 327:70; Klein et al. (1988) Proc. Nat. Acad. Sci. USA 85:8502-8505;
and International PCT application publication no. WO 91/00358).

2. MACs

Mammalian artificial chromosomes (MACs) are prepared by the in vivo
and in vitro methods described in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application No. WO 97/40183. MACs are prepared as
microcells, and the microcells are fused with plant protoplasts in the
presence or absence of PEG (Dudits et al. (1976) Hereditas 82:121-123;
Wiegand et al. (1987) J. Cell Sci. 88(Pt. 2):145-149). Alternatively, the MACs
are isolated and delivered directly to plant cells, protoplasts, and other plant
targets via a PEG-mediated process, calcium phosphate-mediated process,
electroporation, microinjection, sonoporation, or any method known in the
art as described herein and in US Patent Nos. 6,025,155 and 6,077,697,
and International PCT application publication No. WO 97/40183.

After PACs or MACs are introduced into plant targets and the plant
targets are grown and analyzed for transfection, the transformed plant
targets are developed using standard conditions into roots, shoots, plantlets,
or any structure capable of growing into a plant. Transgenic plants can, in
turn, be generated by the planting of transformed roots, plantlets, seeds,
seedlings and structures capable of growing into a plant. Transgenic
plants can be propagated, for example, through seed, cuttings, or vegetative
propagation.

I. Applications and Uses of Artificial Chromosomes 

Artificial chromosomes provide convenient and useful vectors, and in
some instances (e.g., in the case of very large heterologous genes) the only
vectors, for introduction of heterologous genes into hosts. Virtually any
gene of interest is amenable to introduction into a host via artificial
chromosomes.

As described herein, there are numerous methods for using artificial
chromosomes to introduce coding sequences into plant cells. These include
methods for using artificial chromosomes to express genes encoding
commercially valuable enzymes and therapeutic compounds in plant cells,
introduction of agronomically important traits, or applications related to the
manipulation of large regions of DNA.

The artificial chromosomes provided herein may be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry. They are also intended for use in methods of gene
therapy and for production of transgenic organisms, particularly plants
(discussed above, below and in the EXAMPLES).

1. Production of products in plants

Methods for expression of heterologous proteins in plant cells
("molecular farming") are provided. At present, many foreign proteins have
been expressed in whole plants or selected plant organs. Plants can offer a
highly effective and economical means to produce recombinant proteins as
they can be grown on a large scale at modest cost. The production of
heterologous proteins in plants has included genes that are fused to strong
constitutive plant promoters (e.g., 35S from cauliflower mosaic virus;
Sijmons et al., 1990, Bio/Technology 8:217-221; Benfey and Chua, US
5,110,732; Fraley et al., US 5,858,742; McPherson and Kay, US
5,359,142); seed specific promoters (Hall et al., US 5,504,200; Knauf et al.,
US 5,530,194; Thomas et al., US 5,905,186; Moloney, US 5,792,922, US
5,948,682); or promoters active in other plant organs such as fruit (Radke et
al., 1988, Theoret. Appl. Genet. 75:685-694; Bestwick et al., US
5,783,394; Houck and Pear, US 4,943,674) or storage organs such as
tubers (Rocha-Sosa et al., US 5,436,393, US 5,723,757). The genes under
the control of these promoters can encode any protein and include, for example,
genes that encode receptors, cytokines, enzymes, proteases, hormones,
growth factors, antibodies, tumor suppressor genes, vaccines, therapeutic
products and multigene pathways.

Industrial enzymes that can be produced include, for
example, α-amylase, glucanase, phytase and xylanase (see Goddijn and Pen
(1995) Trends Biotechnol. 13:379-387; Pen et al. (1992) Bio/Technology
10:292-296; Horvath et al. (2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-1919;
and, e.g., Herbers and Sonnewald (1996) in Transgenic Plants: A
Production System for Industrial and Pharmaceutical Proteins, Owen and Pen,
eds., John Wiley & Sons, West Sussex, England), proteases such as
subtilisin and other industrially important enzymes. Additional proteins that
can be produced in crops by molecular farming include other industrial
enzymes, for example, proteases, carbohydrate modifying enzymes such as
glucose oxidase, cellulases, hemicellulases, xylanases, mannanases or
pectinases (e.g., Baszczynski et al., US 5,824,870, US 5,767,379; Bruce et
al., US 5,804,694). Additionally, enzymes particularly
valuable in the pulp and paper industry, such as ligninases or xylanases, also
can be expressed (Austin-Philips et al., US 5,981,835). Other examples of
enzymes include phosphatases, oxidoreductases and phytases (van Ooijen
et al., US 5,714,474).

Additionally, expression and delivery of vaccines in plants has been
proposed (Arntzen and Lam, US 6,136,320, US 5,914,123; Curtiss and
Cardineau, US 5,679,880, US 5,654,184; Lam and Arntzen,
US 5,612,487, US 6,034,298; Rymerson et al., WO9937784A1), as well as
antibodies (Conrad et al., WO 972900A1; Hein et al., US 5,959,177; Hiatt
and Hein, US 5,202,422, US 5,639,947; Hiatt et al., US 6,046,037),
peptide hormones (Vandekerckhove, J.S., US 5,487,991; Brandle et al.,
WO9967401A2), blood factors and similar therapeutic molecules.
Expression of vaccines in edible plants can provide a means for drug delivery
which is cost effective and particularly suited for the administration of
therapeutic agents in rural or underdeveloped countries. The plant material
containing the therapeutic agents could be cultivated and incorporated into
the diet (Lam, D.M., and Arntzen, C.J., US 5,484,719). Similarly, plants
used for animal feed can be engineered to express veterinary biologics that
can provide protection against animal disease (Rymerson et al.,
WO9937784A1). Antibodies also can be produced in plants, including, for
example, a gene fusion encoding an antigen-binding single chain Fv protein
(scFv) that recognizes the hapten oxazolone (Fiedler and Conrad (1995)
Bio/Technology 13:1090-1093) and IgG (Ma et al. (1995) Science 268:716-719).
Monoclonal antibodies for therapeutic and diagnostic applications are
of particular interest.

Examples of human biopharmaceuticals that can be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen, eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, eds., Humana Press, New Jersey, pp. 77-87).

Cells containing the artificial chromosomes provided herein can
advantageously be used in in vitro plant cell-based systems for production of
proteins, particularly several proteins from one cell line, such as multiple
proteins involved in a biochemical pathway or multivalent vaccines. The
genes encoding the proteins are introduced into the artificial chromosomes
which are then introduced into plant cells. Plant cells useful for this purpose
are those that grow well in culture, or most preferably, plant cells capable of
being regenerated to whole plants. Plants can then be cultivated by common
methods to produce plant material comprising said heterologous proteins.
The heterologous proteins can be subject to purification or the plant tissue or
extracts thereof can be used directly for vaccination, amelioration of disease,
or processing of material, such as bleaching during pulp and paper
processing or enzymatic conversion of industrial materials or feedstocks.
Alternatively, the heterologous gene(s) of interest are transferred into a
production cell line or plant line that already contains artificial chromosomes
in a manner that targets the gene(s) to the artificial chromosomes. The cells
or plants are grown under conditions whereby the heterologous proteins are
expressed. Because the proteins are expressed at high levels in a stable
permanent extra-genomic chromosomal system, selective conditions are not
required.

Selection of host lines for use in artificial chromosome-based protein
production systems is within the skill of the art, but often will depend on a
variety of factors, including the properties of the heterologous protein to be
produced, potential toxicity of the protein in the host cell, any requirements
for post-translational modification (e.g., glycosylation, amination,
phosphorylation) of the protein, transcription factors available in the cells,
the type of promoter element(s) being used to drive expression of the
heterologous gene, whether production is completely intracellular or the
heterologous protein will preferably be secreted from the cell, or be
sequestered or localized, and the types of processing enzymes in the cell.

Artificial chromosomes can be engineered as platforms for the
production of specific molecules in plant cells. For example, production of
complex mammalian molecules, such as multichain antibodies, requires a
number of protein activities not normally found in plant species. It is
possible to produce an artificial chromosome that comprises all of the
mammalian activities needed to produce human antibodies, correctly modified
and processed, by introducing into an artificial chromosome the genes
needed to carry out these activities. Said genes would be modified, for
example, by placing each gene under the control of a plant promoter, or by
placing a master control gene, i.e., a gene that controls expression of the
various genes, under the control of a plant promoter. Alternatively,
mammalian transcriptional control factors could be introduced, under the
control of plant active promoters, to be expressed in a plant cell and cause
the expression of said target proteins, for example multichain antibodies.

In this fashion, plant artificial chromosomes are developed, each
capable of supporting the efficient production of a specific class of valuable
products, for example, antibodies, blood clotting factors, etc. Thus,
production of products within a class, for example, human antibodies, would
simply involve the introduction of a specific antibody coding sequence,
without modification, into the artificial chromosome engineered specifically for
the production of human antibodies. The artificial chromosome would
comprise all of the required genetic activities for the proper expression,
translation and post-translational modification of human antibodies. Such
artificial chromosomes can be used in a variety of applications, such as, but
not limited to, large scale production of numerous specific human
antibodies.

Advantages of plant cells as host cell lines in the production of
recombinant proteins include, but are not limited to, the following: (1)
proteins are post-translationally modified similar to mammalian systems, (2)
plants can be directed to secrete proteins into stable, dry, intracellular
compartments of seeds called endosperm protein bodies, which can easily be
collected, (3) the amount of recombinant product that can be produced
approaches industrial scale levels and (4) health risks due to contamination
with potential pathogens/toxins are minimized.

The artificial chromosome-based system for heterologous protein
production has many advantageous features. For example, as described
above, because the heterologous DNA is located in an independent,
extra-genomic artificial chromosome (as opposed to randomly inserted in an
unknown area of the host cell genome or located as extrachromosomal
element(s) providing only transient expression), it is stably maintained in an
active transcription unit and is not subject to ejection via recombination or
elimination during cell division. Accordingly, it is unnecessary to include a
selection gene in the host cells and thus growth under selective conditions is
also unnecessary. Furthermore, because the artificial chromosomes are
capable of incorporating large segments of DNA, multiple copies of the
heterologous gene and linked promoter element(s) can be retained in these
chromosomes, thereby providing for high-level expression of the foreign
protein(s). Alternatively, multiple copies of the gene can be linked to a single
promoter element and several different genes can be linked in a fused
polygene complex to a single promoter for expression of, for example, all the
key proteins constituting a complete metabolic pathway (see, e.g., Beck von
Bodman et al. (1995) Bio/Technology 13:587-591). Alternatively, multiple
copies of a single gene can be operatively linked to a single promoter, or
each copy, or one or several copies, can be linked to different promoters or
multiple copies of the same promoter. Additionally, because artificial chromosomes
have an almost unlimited capacity for integration and expression of foreign
genes, they can be used not only for the expression of genes encoding
end-products of interest, but also for the expression of genes associated with
optimal maintenance and metabolic management of the host cell, e.g., genes
encoding growth factors, as well as genes that facilitate rapid synthesis of
the correct form of the desired heterologous protein product, e.g., genes
encoding processing enzymes and transcription factors as described above.

The artificial chromosomes are suitable for expression of any proteins
or peptides, including proteins and peptides that require in vivo
posttranslational modification for their biological activity. Such proteins
include, but are not limited to, antibody fragments, full-length antibodies, and
multimeric antibodies, tumor suppressor proteins, naturally occurring or
artificial antibodies and enzymes, heat shock proteins, and others.

Thus, such cell-based "protein factories" employing artificial
chromosomes can be generated using artificial chromosomes constructed
with multiple copies (theoretically an unlimited number, or at least up to a
number such that the resulting artificial chromosome is up to about the size
of an endogenous genomic chromosome) of protein-encoding genes with
appropriate promoters, or multiple genes driven by a single promoter, i.e., a
fused gene complex (such as a complete metabolic pathway in a plant
expression system; see, e.g., Beck von Bodman et al. (1995) Bio/Technology
13:587-591). Once such an artificial chromosome is constructed, it can be
transferred to a suitable plant species capable of being propagated under
field conditions, or under conditions that permit the recovery of the intended
product. Plant cell cultures such as algae can be used in a system analogous
to mammalian cell culture systems. The advantages of plant-based systems
such as this include low input costs for growth, rapid growth rates and
the ability to produce a large biomass economically.

The ability of artificial chromosomes to provide for high-level
expression of heterologous proteins in host cells is demonstrated, for
example, by analysis of mammalian cells containing a mammalian artificial
chromosome, such as the H1D3 and G3D5 cell lines described herein. Northern blot
analysis of mRNA obtained from these cells reveals that expression of the
hygromycin-resistance and β-galactosidase genes in the cells correlates with
the amplicon number of the megachromosome(s) contained therein.

Transgenic plants producing these compounds are made by the
introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds are produced by the plant, extracted upon harvest
and/or processing, and used for any presently recognized useful purpose
such as pharmaceuticals, fragrances, and industrial enzymes. Alternatively,
plants produced in accordance with the methods and compositions provided
herein can be made to metabolize certain compounds, such as hazardous
wastes, thereby allowing bioremediation of these compounds.

The artificial chromosomes provided herein can be used in methods of
protein and gene product production, particularly using plant cells as host
cells for production of such products, and in cellular production systems in
which the artificial chromosomes provide a reliable, stable and efficient
means for optimizing the biomanufacturing of important compounds for
medicine and industry.

2. Genetic alteration of organisms to possess desired traits

Artificial chromosomes are ideally suited for preparing organisms, such
as plants, that possess certain desired traits, such as, for example, disease
resistance, resistance to harsh environmental conditions, altered growth
patterns and enhanced physical characteristics. With respect to plants, the
choice of the particular nucleic acid that will be delivered to recipient cells via
artificial chromosomes often will depend on the purpose of the
transformation. One of the major purposes of transformation of crop and
tree species is to add some commercially desirable, agronomically important
traits to the plant. Such traits include, but are not limited to, input and
output traits such as herbicide resistance or tolerance, insect resistance or
tolerance, disease resistance or tolerance (viral, bacterial, fungal or
nematode), stress tolerance and/or resistance, as exemplified by resistance
or tolerance to drought, heat, chilling, freezing, excessive moisture, salt
stress and oxidative stress, increased yields, food content and makeup,
physical appearance, male sterility, drydown, standability, prolificacy, starch
quantity and quality, oil quantity and quality, protein quantity and quality and
amino acid composition. It may be desirable to incorporate one or more
genes conferring such desirable traits into host plants.

a. Herbicide resistance

The genes encoding phosphinothricin acetyltransferase (bar and pat),
glyphosate tolerant EPSP synthase genes, the glyphosate degradative
enzyme gene gox encoding glyphosate oxidoreductase, deh (encoding a
dehalogenase enzyme that inactivates dalapon), herbicide resistant
(e.g., sulfonylurea and imidazolinone) acetolactate synthase, and bxn genes
(encoding a nitrilase enzyme that degrades bromoxynil) are all examples of
herbicide resistance genes for use in plant transformation. The bar and pat
genes code for an enzyme, phosphinothricin acetyltransferase (PAT), which
inactivates the herbicide phosphinothricin and prevents this compound from
inhibiting glutamine synthetase enzymes. The enzyme
5-enolpyruvylshikimate 3-phosphate synthase (EPSP synthase) is normally
inhibited by the herbicide N-(phosphonomethyl)glycine (glyphosate).
However, genes are known that encode glyphosate-resistant EPSP synthase
enzymes. The deh gene encodes the enzyme dalapon dehalogenase and
confers resistance to the herbicide dalapon. The bxn gene codes for a
specific nitrilase enzyme that converts bromoxynil to a non-herbicidal
degradation product.

b. Insect and other pest resistance 

Insect-resistant organisms may be prepared in which resistance or
decreased susceptibility to insect-induced disease is conferred by
introduction into the host organism or embryo of artificial chromosomes
containing DNA encoding gene products (e.g., ribozymes and proteins that
are toxic to certain pathogens) that destroy or attenuate pathogens or limit
access of pathogens to the host. Potential insect resistance genes that can
be introduced into plants via artificial chromosomes include Bacillus
thuringiensis crystal toxin genes or Bt genes (see, e.g., Watrud et al. (1985)
in Engineered Organisms and the Environment). Bt genes may provide
resistance to lepidopteran or coleopteran pests such as the European Corn
Borer (ECB). Such Bt toxin genes include the CryIA(b) and CryIA(c) genes.
Endotoxin genes from other species of B. thuringiensis which affect insect
growth or development also may be employed in this regard. Bt gene
sequences can be modified to effect increased expression in plants, and
particularly monocot plants. Means for preparing synthetic genes are well
known in the art and are disclosed in, for example, U.S. Patent Nos.
5,500,365 and 5,689,052. Examples of such modified Bt toxin genes
include a synthetic Bt CryIA(b) gene (see, e.g., Perlak et al. (1991) Proc.
Natl. Acad. Sci. U.S.A. 88:3324-3328) and the synthetic CryIA(c) gene
termed 1800b (see PCT Application publication no. WO95/06128).

Examples of the types of genes that may be transferred into plants via
artificial chromosomes to generate disease- and/or insect-resistant transgenic
plants include, but are not limited to, the cryIA(b) and cryIA(c) genes which
yield products that are highly toxic to two major rice insect pests (the striped
stem borer and the yellow stem borer) (see, e.g., Cheng et al. (1998) Proc.
Natl. Acad. Sci. U.S.A. 95:2767-2772), cry3 genes which encode products
that are toxic to Coleopteran insects that attack a variety of plants, including
grains and legumes (see, e.g., U.S. Patent No. 6,023,013), genes (e.g., DNA
encoding trichothecene 3-O-acetyltransferase) that confer resistance to
trichothecenes such as those produced by plant fungi (e.g., Fusarium) in
plants particularly susceptible to fungi (e.g., wheat, rye, barley, oats, and
maize) (see, e.g., PCT Application publication no. WO 00/60061), and genes
involved in multi-gene biosynthetic pathways that yield antipathogenic
substances that have a deleterious effect on the growth of plant pathogens
(see, e.g., U.S. Patent No. 5,639,949).

Protease inhibitors may also provide insect resistance (see, e.g.,
Johnson et al. (1989)) and will thus have utility in plant transformation. The
use of a protease inhibitor II gene, pinII, from tomato or potato may be
particularly useful. The combined effect of the use of a pinII gene with a Bt
toxin gene can produce synergistic insecticidal activity. Other genes that
encode inhibitors of the insect's digestive system, or those that encode
enzymes or co-factors that facilitate the production of inhibitors, also may be
useful. This group may be exemplified by oryzacystatin and amylase
inhibitors such as those from wheat and barley.

Genes encoding lectins may confer additional or alternative insecticidal
properties. Lectins (originally termed phytohemagglutinins) are multivalent
carbohydrate-binding proteins which have the ability to agglutinate red blood
cells from a range of species. Lectins have been identified as insecticidal
agents with activity against weevils, ECB and rootworm (see, e.g., Murdock
et al. (1990) Phytochemistry 25:85-89; Czapla & Lang (1990) J. Econ.
Entomol. 53:2480-2485). Lectin genes that may be useful include, for
example, barley and wheat germ agglutinin (WGA) and rice lectins
(Gatehouse et al. (1984) J. Sci. Food Agric. 35:373-380).

Genes controlling the production of large and small polypeptides active
against insects when introduced into the insect pests, such as, for example,
lytic peptides, peptide hormones and toxins and venoms, may also be useful
in generating pest-resistant plants. For example, expression of juvenile
hormone esterase, directed toward specific insect pests, also may result in
insecticidal activity, or cause cessation of metamorphosis (see, e.g.,
Hammock et al. (1990) Nature 344:458-461).

Genes which encode enzymes that affect the integrity of the insect
cuticle are additional examples of genes that may
be transferred to plants via artificial chromosomes to confer resistance to
insects. Such genes include those encoding, for example, chitinase,
proteases, lipases and also genes for the production of nikkomycin, a
compound that inhibits chitin synthesis, the introduction of any of which
may be used to produce insect-resistant plants. Genes that affect insect
molting, such as those affecting the production of ecdysteroid UDP-glucosyl
transferase, also can be useful transgenes.

Genes that code for enzymes that facilitate the production of
compounds that reduce the nutritional quality of the host plant to insect
pests may also be used to confer insect resistance on plants. It may be
possible, for instance, to confer insecticidal activity on a plant by altering its
sterol composition. Sterols are obtained by insects from their diet and are
used for hormone synthesis and membrane stability. Therefore, alterations in
plant sterol composition by expression of genes that directly promote the
production of undesirable sterols or those that convert desirable sterols into
undesirable forms could have a negative effect on insect growth and/or
development and hence endow the plant with insecticidal activity.
Lipoxygenases are naturally occurring plant enzymes that have been shown
to exhibit anti-nutritional effects on insects and to reduce the nutritional
quality of their diet. Therefore, transgenic plants with enhanced
lipoxygenase activity may be resistant to insect feeding.

Tripsacum dactyloides is a species of grass that is resistant to certain
insects, including corn rootworm. Tripsacum may thus include genes
encoding proteins that are toxic to insects or are involved in the biosynthesis
of compounds toxic to insects. Such genes may be useful in conferring
resistance to insects. It is known that the basis of insect resistance in
Tripsacum is genetic, because said resistance has been transferred to Zea
mays via sexual crosses (Branson and Guss, 1972). It is further anticipated
that other cereal, monocot or dicot plant species may have genes encoding
proteins that are toxic to insects which would be useful for producing insect
resistant plants.

Further genes encoding proteins characterized as having potential
insecticidal activity also may be used as transgenes in accordance herewith.
Such genes include, for example, the cowpea trypsin inhibitor (CpTI; Hilder
et al., 1987) which may be used as a rootworm deterrent, genes encoding
avermectin (Avermectin and Abamectin, Campbell, W.C., Ed., 1989; Ikeda
et al., 1987) which may prove particularly useful as a corn rootworm
deterrent, ribosome inactivating protein genes and even genes that regulate
plant structures. Transgenic plants including anti-insect antibody genes and
genes that code for enzymes that can convert a non-toxic insecticide
(pro-insecticide) applied to the outside of the plant into an insecticide inside the
plant also are contemplated.

c. Disease resistance 
Transgenic organisms, such as plants, that express genes that confer
resistance or reduce susceptibility to disease are of particular interest. For
example, the transgene may encode a protein that is toxic to a pathogen,
such as a virus, fungus, mycotoxin-producing organism, nematode or
bacterium, but that is not toxic to the transgenic host.

Because multiple genes can be introduced on an artificial
chromosome, a series of genes encoding a genetic pathway involved in
disease resistance or tolerance can be introduced into crop plants. For
example, it is known that numerous genes often are expressed upon
pathogen invasion; typically, one or more "PR", or pathogen related, proteins
are expressed in response to invasion by a plant bacterial or fungal pathogen.
One or more of the proteins involved in conferring resistance to pathogens
can be contained within an artificial chromosome and therefore be expressed
in a plant cell, in particular a whole transgenic plant as described herein. In
addition, production of single-chain Fv recombinant antibodies in plants may
extend the range of possibilities for the introduction of pathogen protection
in crop plants (see, e.g., Tavladoraki et al. (1993) Nature 355:469-472).

It has been demonstrated that expression of a viral coat protein in a
transgenic plant can impart resistance to infection of the plant by that virus
and perhaps other closely related viruses (Cuozzo et al., 1988; Hemenway et
al., 1988; Abel et al., 1986). Expression of antisense genes targeted at
essential viral functions may also impart resistance to viruses. For example,
an antisense gene targeted at the gene responsible for replication of viral
nucleic acid may inhibit replication and lead to resistance to the virus.
Interference with other viral functions through the use of antisense genes
also may increase resistance to viruses. Further, it may be possible to
achieve resistance to viruses through other approaches, including, but not
limited to, the use of satellite viruses. Artificial chromosomes are ideally
suited for carrying a multiplicity of these genes and DNA sequences which
are useful for conferring a broad range of resistance to many pathogens.
Genes encoding so-called "peptide antibiotics," pathogenesis related
(PR) proteins, toxin resistance, and proteins affecting host-pathogen
interactions such as morphological characteristics may also be useful, particularly in
conferring increased resistance to diseases caused by bacteria and fungi.
Peptide antibiotics are polypeptide sequences which are inhibitory to growth
of bacteria and other microorganisms. For example, the classes of peptides
referred to as cecropins and magainins inhibit growth of many species of
bacteria and fungi. Expression of PR proteins in monocotyledonous plants
such as maize may be useful in conferring resistance to bacterial disease.
These genes are induced following pathogen attack on a host plant and have
been divided into at least five classes of proteins (Bol, Linthorst, and
Cornelissen, 1990). Included among the PR proteins are β-1,3-glucanases,
chitinases, and osmotin and other proteins that are believed to function in
plant resistance to disease organisms. Other genes have been identified that
have antifungal properties, e.g., UDA (stinging nettle lectin) and hevein
(Broekaert et al., 1989; Barkai-Golan et al., 1978). It is known that certain
plant diseases are caused by the production of phytotoxins. Resistance to
these diseases may be achieved through expression of a gene that encodes
an enzyme capable of degrading or otherwise inactivating the phytotoxin. It
also is contemplated that expression of genes that alter the interactions
between the host plant and pathogen may be useful in reducing the ability of
the disease organism to invade the tissues of the host plant, e.g., an
increase in the waxiness of the leaf cuticle or other morphological
characteristics.

d. Environment or stress resistance
Improvement of a plant's ability to tolerate various environmental
stresses such as, but not limited to, drought, excess moisture, chilling,
freezing, high temperature, salt, and oxidative stress, also can be effected
through expression of genes therein. It is proposed that benefits may be
realized in terms of increased resistance to freezing temperatures through the
introduction of an "antifreeze" protein such as that of the Winter Flounder
(Cutler et al., 1989) or synthetic gene derivatives thereof. Improved chilling
tolerance also may be conferred through increased expression of
glycerol-3-phosphate acetyltransferase in chloroplasts (Wolter et al., 1992). Resistance
to oxidative stress in some crop species (often exacerbated by conditions
such as chilling temperatures in combination with high light intensities) can
be conferred by expression of superoxide dismutase (Gupta et al., 1993),
and may be improved by glutathione reductase (Bowler et al., 1992). Such
strategies may allow for tolerance to freezing in newly emerged fields as well
as extending later-maturity, higher-yielding varieties to earlier relative maturity
zones.

It is contemplated that the expression of genes that favorably affect
plant water content, total water potential, osmotic potential, and turgor will
enhance the ability of the plant to tolerate drought. As used herein, the
terms "drought resistance" and "drought tolerance" are used to refer to a
plant's increased resistance or tolerance to stress induced by a reduction in
water availability, as compared to normal circumstances, and the ability of
the plant to function and survive in lower-water environments. The
expression of genes encoding the biosynthesis of osmotically-active
solutes, such as polyol compounds, may impart protection against drought.
Within this class are genes encoding mannitol-1-phosphate
dehydrogenase (Lee and Saier, 1982) and trehalose-6-phosphate synthase
(Kaasen et al., 1992). Through the subsequent action of native
phosphatases in the cell or by the introduction and coexpression of a specific
phosphatase, these introduced genes will result in the accumulation of either
mannitol or trehalose, respectively, both of which have been well
documented as protective compounds able to mitigate the effects of stress.
Mannitol accumulation in transgenic tobacco has been verified and
preliminary results indicate that plants expressing high levels of this
metabolite are able to tolerate an applied osmotic stress (Tarczynski et al.,
1992, 1993).

Similarly, the efficacy of other metabolites in protecting either enzyme
function (e.g., alanopine or propionic acid) or membrane integrity (e.g.,
alanopine) has been documented (Loomis et al., 1989), and therefore
expression of genes encoding the biosynthesis of these compounds might
confer drought resistance in a manner similar to or complementary to
mannitol. Other examples of naturally occurring metabolites that are
osmotically active and/or provide some direct protective effect during
drought and/or desiccation include fructose, erythritol (Coxson et al., 1992),
sorbitol, dulcitol (Karsten et al., 1992), glucosylglycerol (Reed et al., 1984;
Erdmann et al., 1992), sucrose, stachyose (Koster and Leopold, 1988;
Blackman et al., 1992), raffinose (Bernal-Lugo and Leopold, 1992), proline
(Rensburg et al., 1993), glycine betaine, ononitol and pinitol (Vernon and
Bohnert, 1992). Continued canopy growth and increased reproductive
fitness during times of stress will be augmented by introduction and
expression of genes such as those controlling the osmotically active
compounds discussed above and other such compounds. Genes which
promote the synthesis of an osmotically active polyol compound include
genes which encode the enzymes mannitol-1-phosphate dehydrogenase,
trehalose-6-phosphate synthase and myoinositol O-methyltransferase.

Artificial chromosomes can carry a multiplicity of genes to provide durable
stress tolerance, for example, concomitant expression of proline and betaine
and/or polyols.

It is contemplated that the expression of specific proteins also may
increase drought tolerance under certain conditions or in certain crop
species. These may include proteins such as late embryogenesis abundant
(LEA) proteins (see Dure et al., 1989). All three classes of LEAs have been demonstrated in
maturing (i.e., desiccating) seeds. Within LEA proteins, the Type-II (dehydrin-
type) have generally been implicated in drought and/or desiccation tolerance
in vegetative plant parts (see Mundy and Chua, 1988; Piatkowski et al.,
1990; Yamaguchi-Shinozaki et al., 1992). Recently, expression of a Type-III
LEA (HVA-1) in tobacco was found to influence plant height, maturity and
drought tolerance (Fitzpatrick, 1993). In rice, expression of the HVA-1 gene
influenced tolerance to water deficit and salinity (Xu et al., 1996).
Expression of structural genes from all three LEA groups may therefore
confer drought tolerance. Other types of proteins induced during water
stress include thiol proteases, aldolases and transmembrane transporters
(Guerrero et al., 1990), which may confer various protective and/or repair-
type functions during drought stress. It also is contemplated that genes
that affect lipid biosynthesis and hence membrane composition might also be
useful in conferring drought resistance on the plant.

Many of these genes for improving drought resistance have
complementary modes of action. Thus, combinations of these genes might
have additive and/or synergistic effects in improving drought resistance in
plants. Many of these genes also improve freezing tolerance (or resistance);
the physical stresses incurred during freezing and drought are similar in
nature and may be mitigated in similar fashion. Benefit may be conferred via
constitutive expression of these genes, but the preferred means of
expressing these genes may be through the use of a turgor-induced promoter
(such as the promoters for the turgor-induced genes described in Guerrero et
al., 1990 and Shagan et al., 1993, which are incorporated herein by
reference). Spatial and temporal expression patterns of these genes may
enable plants to better withstand stress.

It is proposed that expression of genes that are involved with specific
morphological traits that allow for increased water extractions from drying
soil would be of benefit. For example, introduction and expression of genes
that alter root characteristics may enhance water uptake. It also is
contemplated that expression of genes that enhance reproductive fitness
during times of stress would be of significant value. For example, expression
of genes that improve the synchrony of pollen shed and receptiveness of the
female flower parts, i.e., silks, would be of benefit. In addition it is
proposed that expression of genes that minimize kernel abortion during times
of stress would increase the amount of grain to be harvested and hence be
of value.

is possible for as few as 50 clones to represent the entire micro-
megachromosome.

a. Centromeres
An exemplary centromere for use in the construction of an artificial
chromosome is that contained within a megachromosome, such as those
described herein. One example of a particular megachromosome-containing
cell line provided is, for example, H1D3 and derivatives thereof, such as
mM2C1 cells. Megachromosomes are isolated from such cell lines utilizing,
for example, the procedures described herein, and the centromeric sequence
is extracted from the isolated megachromosomes. For example, the
megachromosomes may be separated into fragments utilizing selected
restriction endonucleases that recognize and cut at sites that, for instance,
are primarily located in the replication and/or heterologous DNA integration
sites and/or in the satellite DNA. Based on the sizes of the resulting
fragments, certain undesired elements may be separated from the
centromere-containing sequences. The centromere-containing DNA could be
as large as 1 Mb.

Probes that specifically recognize centromeric sequences, such as
mouse minor satellite DNA-based probes [see, e.g., Wong et al. (1988) Nucl.
Acids Res. 16:11645-11661], pCT4.2 probe, a 3.5 kb fragment of
Arabidopsis 5S rDNA (Campbell et al. (1992) Gene 112:225-228),
Arabidopsis cosmids E4.11 (30 kb) and E4.6 (33 kb; Bent et al. (1994)
Science 265:1856-1860), and 180 bp pAL1 repeat sequence (Maluszynska et
al. (1991) Plant J. 7:159-166; and Martinez-Zapater et al. (1986) Mol. Gen.
Genet. 204:417-423) may be used to isolate a centromere-containing YAC,
BAC or PAC clone derived from the megachromosome. Alternatively, or in
conjunction with the direct identification of centromere-containing
megachromosomal DNA, probes that specifically recognize the non-
centromeric elements, such as probes specific for mouse major satellite DNA,

Given the overall role of water in determining yield, it is contemplated
that enabling plants to utilize water more efficiently, through the introduction
and expression of genes, will improve overall performance even when soil
water availability is not limiting. By introducing genes that improve the
ability of plants to maximize water usage across a full range of stresses
relating to water availability, yield stability or consistency of yield
performance may be realized.

e. Plant agronomic characteristics 
Plants possessing desired traits that might, for example, enhance
utility, processability and commercial value of the organisms in areas such as
the agricultural and ornamental plant industries may also be generated using
artificial chromosomes in the same manner as described above for production
of disease-resistant organisms. In such instances, the artificial chromosomes
that are introduced into the organism or embryo contain DNA encoding gene
products that serve to confer the desired trait in the organism.

For example, transgenic plants having improved flavor properties,
stability and/or quality are of commercial interest. One possible method for
generating such plants may include the expression of transgenes, e.g., genes
encoding cystathionine gamma synthase (CGS), that result in increased free
methionine levels (see, e.g., PCT Application publication no. WO 00/55303).

Two of the factors determining where crop plants can be grown are
the average daily temperature during the growing season and the length of
time between frosts. Within the areas where it is possible to grow a
particular crop, there are varying limitations on the maximal time it is allowed
to grow to maturity and be harvested. For example, a variety to be grown in
a particular area is selected for its ability to mature and dry down to
harvestable moisture content within the required period of time with
maximum possible yield. Therefore, crops of varying maturities are
developed for different growing locations. Apart from the need to dry down
sufficiently to permit harvest, it is desirable to have maximal drying take
place in the field to minimize the amount of energy required for additional
drying post-harvest. Also, the more readily a product such as grain can dry
down, the more time there is available for growth and kernel fill. Genes that
influence maturity and/or dry down can be identified and introduced into
plant lines using transformation techniques to create new varieties adapted
to different growing locations or the same growing location, but having
improved yield to moisture ratio at harvest. Expression of genes that are
involved in regulation of plant development may be especially useful.

Genes that would improve standability and other plant growth
characteristics may also be introduced into plants. Expression of new genes
in plants which confer stronger stalks, improved root systems, or prevent or
reduce ear droppage would be of great value to the farmer. Introduction and
expression of genes that increase the total amount of photoassimilate
available by, for example, increasing light distribution and/or interception
would be advantageous. In addition, the expression of genes that increase
the efficiency of photosynthesis and/or the leaf canopy would further
increase gains in productivity. Expression of a phytochrome gene in crop
plants may be advantageous. Expression of such a gene may reduce
apical dominance, confer semidwarfism on a plant, and increase shade
tolerance (U.S. Patent No. 5,268,526). Such approaches would allow for
increased plant populations in the field.

f. Nutrient utilization

The ability to utilize available nutrients may be a limiting factor in
growth of crop plants. It may be possible to alter nutrient uptake, tolerance
of pH extremes, mobilization through the plant, storage pools, and availability
for metabolic activities by the introduction of new agents. These
modifications would allow a plant such as maize to utilize available
nutrients more efficiently. An increase in the activity of, for example, an enzyme
that is normally present in the plant and involved in nutrient utilization may
increase the availability of a nutrient. An example of such an enzyme would
be phytase. It is further contemplated that enhanced nitrogen utilization by a
plant is desirable. Expression of a glutamate dehydrogenase gene in plants,
e.g., E. coli gdhA genes, may lead to enhanced resistance to the herbicide
glufosinate by incorporation of excess ammonia into glutamate, thereby
detoxifying the ammonia. Gene expression may make a nutrient source
available that was previously not accessible, e.g., an enzyme that releases a
component of nutrient value from a more complex molecule, perhaps a
macromolecule. Alternatively, artificial chromosomes can carry the
multiplicity of genes governing nodulation and nitrogen fixation in legumes.
The artificial chromosomes could be used to promote nodulation in
non-legume species.

g. Male sterility 

Male sterility is useful in the production of hybrid seed. Male sterility
may be produced through gene expression. For example, it has been shown
that expression of genes that encode proteins that interfere with
development of the male inflorescence and/or gametophyte results in male
sterility. Chimeric ribonuclease genes that express in the anthers of
transgenic tobacco and oilseed rape have been demonstrated to lead to male
sterility (Mariani et al., 1990). Other methods of conferring male sterility
have been described, including genes encoding antisense RNA capable of
causing male sterility (U.S. Patent Nos. 6,184,439, 6,191,343 and
5,728,926) and methods utilizing two genes to confer sterility (see, e.g.,
U.S. Patent No. 5,426,041).

A number of mutations were discovered in maize that confer
cytoplasmic male sterility. One mutation in particular, referred to as T
cytoplasm, also correlates with sensitivity to Southern corn leaf blight. A
DNA sequence, designated TURF-13 (Levings, 1990), was identified that
correlates with T cytoplasm. It is proposed that it would be possible, through
the introduction of TURF-13 via transformation, to separate male sterility
from disease sensitivity. As it is necessary to be able to restore male fertility
for breeding purposes and for grain production, it is proposed that genes
encoding restoration of male fertility also may be introduced.

h. Improved nutritional content

Genes may be introduced into plants to improve the nutrient quality or
content of a particular crop. Introduction of genes that alter the nutrient
composition of a crop may greatly enhance the feed or food value. For
example, the protein of many grains is suboptimal for feed and food purposes
especially when fed to pigs, poultry, and humans. The protein is deficient in
several amino acids that are essential in the diet of these species, requiring
the addition of supplements to the grain. Limiting essential amino acids may
include lysine, methionine, tryptophan, threonine, valine, arginine, and
histidine. Some amino acids become limiting only after corn is supplemented
with other inputs for feed formulations. The levels of these essential amino
acids in seeds and grain may be elevated by mechanisms which include, but
are not limited to, the introduction of genes to increase the biosynthesis of
the amino acids, increase the storage of the amino acids in proteins, or
increase transport of the amino acids to the seeds or grain.

The protein composition of a crop may be altered to improve the
balance of amino acids in a variety of ways including elevating expression of
native proteins, decreasing expression of those with poor composition,
changing the composition of native proteins, or introducing genes encoding
entirely new proteins possessing superior composition.

The introduction of genes that alter the oil content of a crop plant may
also be of value. Increases in oil content may result in increases in
metabolizable energy content and density of seeds for use in feed and food.
The introduced genes may encode enzymes that remove or reduce rate-
limitations or regulated steps in fatty acid or lipid biosynthesis. Such genes
may include, but are not limited to, those that encode acetyl-CoA
carboxylase, ACP-acyltransferase, β-ketoacyl-ACP synthase, plus other well
known fatty acid biosynthetic activities. Other possibilities are genes that
encode proteins that do not possess enzymatic activity such as acyl-carrier
proteins. Genes may be introduced that alter the balance of fatty acids
present in the oil providing a more healthful or nutritive feedstuff. The
introduced DNA also may encode sequences that block expression of
enzymes involved in fatty acid biosynthesis, altering the proportions of fatty
acids present in crops.

Genes may be introduced that enhance the nutritive value of the
starch component of crops, for example by increasing, or in some cases
decreasing, the degree of branching, resulting in improved utilization of the
starch in livestock by delaying its metabolism. Additionally, other major
constituents of a crop may be altered by introducing genes that affect a variety of
other nutritive, processing, or other quality aspects. For example,
pigmentation may be increased or decreased.

Feed or food crops may also possess insufficient quantities of
vitamins, requiring supplementation to provide adequate nutritive value.
Introduction of genes that enhance vitamin biosynthesis may be envisioned
including, for example, vitamins A (e.g., rice with Vitamin A or golden rice),
E, B12, choline, and the like. Mineral content may also be sub-optimal. Thus
genes that affect the accumulation or availability of compounds containing
phosphorus, sulfur, calcium, manganese, zinc, and iron among others would
be valuable.

Numerous other examples of improvements of crops may be effected
using the artificial chromosomes, with appropriate heterologous genes
contained therein, in accordance with the methods and compositions
provided herein. The improvements may not necessarily involve grain, but
may, for example, improve the value of a crop for silage. Introduction of
DNA to accomplish this might include sequences that alter lignin production
such as those that result in the "brown midrib" phenotype associated with
superior feed value for cattle.

In addition to direct improvements in feed or food value, genes also
may be introduced which improve the processing of crops and improve the
value of the products resulting from the processing. One use of crops is via
wetmilling. Thus, genes that increase the efficiency and reduce the cost of
such processing, for example, by decreasing steeping time, may also find use.
Improving the value of wetmilling products may include altering the quantity
or quality of starch, oil, corn gluten meal, or the components of gluten feed.
Elevation of starch may be achieved through the identification and
elimination of rate limiting steps in starch biosynthesis or by decreasing
levels of the other components of crops resulting in proportional increases in
starch.

Oil is another product of wetmilling, the value of which may be
improved by introduction and expression of genes. Oil properties may be
altered to improve its performance in the production and use of cooking oil,
shortenings, lubricants or other oil-derived products or to improve its
health attributes when used in food-related applications. Fatty acids also
may be synthesized which upon extraction can serve as starting materials for
chemical syntheses. The changes in oil properties may be achieved by
altering the type, level, or lipid arrangement of the fatty acids present in the
oil. This in turn may be accomplished by the addition of genes that encode
enzymes that catalyze the synthesis of new fatty acids and the lipids
possessing them or by increasing levels of native fatty acids while possibly
reducing levels of precursors. Alternatively, DNA sequences may be
introduced which slow or block steps in fatty acid biosynthesis resulting in
the increase in precursor fatty acid intermediates. Genes that might be
added include desaturases, epoxidases, hydratases, dehydratases and other
enzymes that catalyze reactions involving fatty acid intermediates.
Representative examples of catalytic steps that might be blocked include the
desaturations from stearic to oleic acid and oleic to linolenic acid resulting in
the respective accumulations of stearic and oleic acids. Another example is
the blockage of elongation steps resulting in the accumulation of C8 to C12
saturated fatty acids.

i. Production of chemicals or biologicals 
Transgenic plants can be used as protein production systems to
generate recombinant products including industrial enzymes, viral
antigens, vaccines, antibodies, human blood proteins, cytokines, growth
factors, enkephalins, serum albumin and other proteins of clinical relevance,
and pharmaceuticals. Examples include enzymes such as α-amylase, glucanase,
phytase and xylanase (see, Goddijn and Pen (1995) Trends Biotechnol.
13:379-387; Pen et al. (1992) Bio/Technology 10:292-296; Horvath et al.
(2000) Proc. Natl. Acad. Sci. U.S.A. 97:1914-1919; and, e.g., Herbers and
Sonnewald (1996) in Transgenic Plants: A Production System for Industrial
and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley & Sons, West
Sussex, England).

Examples of medically relevant proteins that may be produced in
plants include surface antigens of viral pathogens, such as hepatitis B virus
and transmissible gastroenteritis virus spike protein, for use in vaccines. The
proteins thus produced may be isolated and administered through standard
vaccine introduction methods or through oral consumption of the edible
transgenic plant as food (see, e.g., U.S. Patent No. 6,136,320 and Mason
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:11745-11749). HIV, rhinovirus,
malarial and rabies virus antigens are additional examples of antigens that
may be expressed in plants as candidate vaccines (see, e.g., Porta et al.
(1994) Virol. 202:949-955; Turpen et al. (1995) Bio/Technology 13:53-57;
and McGarvey et al. (1995) Bio/Technology 13:1484-1487). Antibodies may
also be produced in plants, including, for example, a gene fusion encoding an
antigen-binding single chain Fv protein (scFv) that recognizes the hapten
oxazolone (Fiedler and Conrad (1995) Bio/Technology 13:1090-1093) and IgG
(Ma et al. (1995) Science 268:716-719).

Examples of human biopharmaceuticals that may be expressed in
plants include, but are not limited to, albumin (Sijmons et al. (1990)),
enkephalins (Vandekerckhove et al. (1989)), interferon-α (Zhu et al. (1994))
and GM-CSF (Ganz et al. (1996) in Transgenic Plants: A Production System
for Industrial and Pharmaceutical Proteins, Owen and Pen Eds., John Wiley &
Sons, West Sussex, England, pp. 281-297; and Sardana et al. (1998) in
Methods in Biotechnology, Vol. 3: Recombinant Proteins from Plants:
Production and Isolation of Clinically Useful Compounds, Cunningham and
Porter, Eds., Humana Press, New Jersey; pp. 77-87).

Transgenic plants producing these compounds are made possible by
the introduction and expression of one or potentially many genes using the
artificial chromosomes provided herein. The vast array of possibilities
includes, but is not limited to, any biological compound which is presently
produced by any organism such as proteins, nucleic acids, primary and
intermediary metabolites, carbohydrate polymers, enzymes for uses in
bioremediation, enzymes for modifying pathways that produce secondary
plant metabolites such as flavonoids or vitamins, enzymes that could produce
pharmaceuticals, and enzymes that could produce compounds
of interest to the manufacturing industry such as specialty chemicals and
plastics. The compounds may be produced by the plant, extracted upon
harvest and/or processing, and used for any presently recognized useful
purpose such as pharmaceuticals, fragrances, and industrial enzymes to
name a few. Alternatively, plants produced in accordance with the methods
and compositions provided herein may be made to metabolize certain
compounds, such as hazardous wastes, thereby allowing bioremediation of
these compounds.

j. Non-protein-expressing sequences 

Nucleic acids may be introduced into plants that are designed to
down-regulate or suppress a plant-encoded gene. A number of different means
to achieve down-regulation have been demonstrated in the art, including
antisense RNA, ribozymes and co-suppression. The use of antisense RNA to
suppress plant genes is described, for example, in U.S. Patent Nos.
4,801,540, 5,107,065 and 5,453,566. In such methods, an "antisense"
gene is constructed that encodes an RNA that is complementary to the
mRNA of a resident plant gene, such that expression of the antisense gene
inhibits the translation of the mRNA of the resident plant gene. Thus, the
activity of the resident gene is down-regulated.
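
The complementarity relationship described above can be illustrated with a
short sketch. It is offered only as an illustration under stated assumptions:
the function name antisense_rna and the example sequence are hypothetical and
do not come from the cited patents, and the sketch models base pairing only,
not the biological efficacy of any antisense construct.

```python
# Illustrative sketch only: derive the RNA complementary to an mRNA fragment.
# The function name and the example sequence are hypothetical.

# Watson-Crick pairing rules for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna: str) -> str:
    """Return the reverse complement of an mRNA sequence, written 5'->3'."""
    mrna = mrna.upper()
    # Reverse the strand, then pair each base with its complement.
    return "".join(COMPLEMENT[base] for base in reversed(mrna))


if __name__ == "__main__":
    example_mrna = "AUGGCUUCA"  # made-up sequence for illustration
    print(antisense_rna(example_mrna))
```

Applying the function twice returns the original sequence, which is a simple
check that the two strands are mutually complementary.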

An additional method of down-regulating gene activities involves
ribozymes, or catalytic hammerhead or hairpin RNA structures. The use of
ribozymes is described, for example, in U.S. Patent Nos. 4,987,071,
5,037,746, 5,116,742 and 5,354,855. These methods rely on the
expression of small catalytic "hammerhead" RNA molecules that are capable
of binding to and cleaving specific RNA sequences. Ribozymes designed to
specifically recognize a resident plant mRNA can be used to cleave the
mRNA and prevent its proper expression.

Down-regulation of gene activities essentially equivalent to that obtained
with ribozymes and antisense RNA can be achieved by adding additional
copies of the gene to be regulated. The process is referred to as
co-suppression and is described in, for example, U.S. Patent Nos. 5,034,323,
5,283,184 and 5,231,020.

Numerous plant genes may be targeted for down-regulation. For
example, a gene may be down-regulated that encodes an enzyme that
catalyzes a reaction in a plant. Reduction of the enzyme activity may reduce
or eliminate products of the reaction which include any enzymatically
synthesized compound in the plant such as fatty acids, amino acids,
carbohydrates, nucleic acids and the like. Alternatively, the protein may be a
storage protein, such as zein, or a structural protein, the decreased
expression of which may lead to changes in seed amino acid composition or
plant morphological changes, respectively. The possibilities cited above are
provided only by way of example and do not represent the full range of
applications.

(1.) Antisense RNA

Genes may be constructed which, when transcribed, produce
antisense RNA that is complementary to all or part(s) of a targeted
messenger RNA(s). The antisense RNA reduces production of the
polypeptide product of the messenger RNA. The polypeptide product may be
any protein encoded by the plant genome. The aforementioned genes will be
referred to as antisense genes. An antisense gene may thus be introduced
into a plant by transformation methods to produce a transgenic plant with
reduced expression of a selected protein of interest. For example, the
protein may be an enzyme that catalyzes a reaction in the plant. Reduction
of the enzyme activity may reduce or eliminate products of the reaction
which include any enzymatically synthesized compound in the plant such as
fatty acids, amino acids, carbohydrates, nucleic acids and the like.
Alternatively, the protein may be a storage protein, such as a zein, or a
structural protein, the decreased expression of which may lead to changes in
seed amino acid composition or plant morphological changes, respectively.
The possibilities cited above are provided only by way of example and do not
represent the full range of applications.

(2.) Ribozymes

Genes also may be constructed or isolated which, when transcribed,
produce RNA enzymes (ribozymes) which can act as endoribonucleases and
catalyze the cleavage of RNA molecules with selected sequences. The
cleavage of selected messenger RNAs can result in the reduced production of
their encoded polypeptide products. These genes may be used to prepare
transgenic plants which possess them. The transgenic plants may possess
reduced levels of polypeptides including, but not limited to, the polypeptides
cited above.

Ribozymes are RNA-protein complexes that cleave nucleic acids in a
site-specific fashion. Ribozymes have specific catalytic domains that
possess endonuclease activity (Kim and Cech, 1987; Gerlach et al., 1987;
Forster and Symons, 1987). For example, a large number of ribozymes
accelerate phosphoester transfer reactions with a high degree of specificity,
often cleaving only one of several phosphoesters in an oligonucleotide
substrate (Cech et al., 1981; Michel and Westhof, 1990; Reinhold-Hurek
and Shub, 1992). This specificity has been attributed to the requirement
that the substrate bind via specific base-pairing interactions to the internal
guide sequence ("IGS") of the ribozyme prior to chemical reaction.

Ribozyme catalysis has primarily been observed as part of
sequence-specific cleavage/ligation reactions involving nucleic acids (Joyce,
1989; Cech et al., 1981). For example, U.S. Patent 5,354,855 reports that
certain ribozymes can act as endonucleases with a sequence specificity greater
than that of known ribonucleases and approaching that of the DNA restriction
enzymes.

Several different ribozyme motifs have been described with RNA
cleavage activity (Symons, 1992). Examples include sequences from the
Group I self splicing introns including Tobacco Ringspot Virus (Prody et al.,
1986), Avocado Sunblotch Viroid (Palukaitis et al., 1979; Symons, 1981)
and Lucerne Transient Streak Virus (Forster and Symons, 1987). Sequences
from these and related viruses are referred to as hammerhead ribozymes
based on a predicted folded secondary structure.

Other suitable ribozymes include sequences from RNase P with RNA
cleavage activity (Yuan et al., 1992; Yuan and Altman, 1994; U.S. Patents
5,168,053 and 5,624,824), hairpin ribozyme structures (Berzal-Herranz et
al., 1992; Chowrira et al., 1993) and Hepatitis Delta virus based ribozymes
(U.S. Patent 5,625,047). The general design and optimization of ribozyme
directed RNA cleavage activity has been discussed in detail (Haseloff and
Gerlach, 1988; Symons, 1992; Chowrira et al., 1994; Thompson et al.,
1995).

The other variable in ribozyme design is the selection of a cleavage
site on a given target RNA. Ribozymes are targeted to a given sequence by
virtue of annealing to a site by complementary base pair interactions. Two
stretches of homology are required for this targeting. These stretches of
homologous sequences flank the catalytic ribozyme structure defined above.
Each stretch of homologous sequence can vary in length from 7 to 15
nucleotides. The only requirement for defining the homologous sequences is
that, on the target RNA, they are separated by a specific sequence which is
the cleavage site. For hammerhead ribozymes, the cleavage site is a
dinucleotide sequence on the target RNA: a uracil (U) followed by either an
adenine, cytosine or uracil (A, C or U) (Perriman et al., 1992; Thompson et
al., 1995). The frequency of this dinucleotide occurring in any given RNA is
statistically 3 out of 16. Therefore, for a given target messenger RNA of
1,000 bases, 187 dinucleotide cleavage sites are statistically possible.
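
The site-counting arithmetic above can be reproduced with a minimal sketch.
It assumes, as the estimate in the text implicitly does, that the four bases
are equally likely and independent at each position. The function names
find_cleavage_sites and expected_site_count are hypothetical and introduced
only for illustration; the sketch locates candidate dinucleotides and makes no
prediction about whether a ribozyme directed to a given site would cleave
efficiently.

```python
# Illustrative sketch only: locate candidate hammerhead cleavage dinucleotides
# (U followed by A, C or U) in a target RNA and estimate how many to expect.
# Function names are hypothetical.
from typing import List


def find_cleavage_sites(rna: str) -> List[int]:
    """Return 0-based positions of each U that is followed by A, C or U."""
    rna = rna.upper()
    return [
        i
        for i in range(len(rna) - 1)
        if rna[i] == "U" and rna[i + 1] in "ACU"
    ]


def expected_site_count(length: int) -> float:
    """Expected number of sites, assuming equiprobable, independent bases.

    Three of the sixteen possible dinucleotides (UA, UC, UU) qualify, and a
    sequence of `length` bases contains `length - 1` dinucleotide positions.
    """
    return (length - 1) * 3 / 16


if __name__ == "__main__":
    print(find_cleavage_sites("GUCAUUG"))  # made-up target fragment
    print(int(expected_site_count(1000)))  # about 187 for a 1,000-base RNA
```

In practice each candidate position would still have to be paired with two
flanking arms of 7 to 15 nucleotides and tested empirically, as the following
paragraph describes.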

Designing and testing ribozymes for efficient cleavage of a target RNA
is a process well known to those skilled in the art. Examples of scientific
methods for designing and testing ribozymes are described by Chowrira et al.
(1994) and Lieber and Strauss (1995), each incorporated by reference. The
identification of operative and preferred sequences for use in down-regulating
a given gene is simply a matter of preparing and testing a given sequence,
and is a routinely practiced "screening" method known to those of skill in the
art.

(3.) Induction of gene silencing 

It also is possible that genes may be introduced to produce transgenic
plants which have reduced expression of a native gene product by the
mechanism of co-suppression. It has been demonstrated in tobacco, tomato,
and petunia (Goring et al., 1991; Smith et al., 1990; Napoli et al., 1990; van
der Krol et al., 1990) that expression of the sense transcript of a native gene
will reduce or eliminate expression of the native gene in a manner similar to
that observed for antisense genes. The introduced gene may encode all or
part of the targeted native protein but its translation may not be required for
reduction of levels of that native protein.

(4.) Non-RNA-expressing sequences 
DNA elements including those of transposable elements such as Ds,
Ac, or Mu, may be inserted into a gene to cause mutations. These DNA
elements may be inserted in order to inactivate (or activate) a gene and
thereby "tag" a particular trait. In this instance the transposable element
does not cause instability of the tagged mutation, because the utility of the
element does not depend on its ability to move in the genome. Once a
desired trait is tagged, the introduced DNA sequence may be used to clone
the corresponding gene, e.g., using the introduced DNA sequence as a PCR
primer together with PCR gene cloning techniques (Shapiro, 1983; Dellaporta
et al., 1988). Once identified, the entire gene(s) for the particular trait,
including control or regulatory regions where desired, may be isolated, cloned
and manipulated as desired. The utility of DNA elements introduced into an
organism for purposes of gene tagging is independent of the DNA sequence
and does not depend on any biological activity of the DNA sequence, i.e.,
transcription into RNA or translation into protein. The sole function of the
DNA element is to disrupt the DNA sequence of a gene.

It is contemplated that unexpressed DNA sequences, including
synthetic sequences, could be introduced into cells as proprietary "labels" of
those cells and plants and seeds thereof. It would not be necessary for a
label DNA element to disrupt the function of a gene endogenous to the host
organism, as the sole function of this DNA would be to identify the origin of
the organism. For example, one could introduce a unique DNA sequence into
a plant and this DNA element would identify all cells, plants, and progeny of
these cells as having arisen from that labeled source. It is proposed that
inclusion of label DNAs would enable one to distinguish proprietary
germplasm, or germplasm derived from such, from unlabelled germplasm.

Another possible element which may be introduced is a matrix
attachment region element (MAR), such as the chicken lysozyme A element
(Stief, 1989), which can be positioned around an expressible gene of interest
to effect an increase in overall expression of the gene and diminish position
dependent effects upon incorporation into the plant genome (Stief et al.,
1989; Phi-Van et al., 1990). Sequences such as MARs can be included on
the artificial chromosome to enhance gene expression.

3. Transgenic models for evaluation of genes and discovery of 
new traits 

Of significant interest is the use of plants and plant cells containing
artificial chromosomes for the evaluation of new genetic combinations and
discovery of new traits. Because artificial chromosomes can contain
significant amounts of DNA, they can encode numerous genes and
accordingly a multiplicity of traits. It is contemplated
here that artificial chromosomes, when formed from one plant species, can
be evaluated in a second plant species. The resultant phenotypic changes
observed, for example, can indicate the nature of the genes contained within
the DNA of the artificial chromosome, and hence permit the
identification of new genetic activities. Artificial chromosomes containing
euchromatic DNA or partially containing euchromatic DNA can serve as a
valuable source of new traits when transferred to an alien plant cell
environment. For example, it is contemplated that artificial chromosomes
derived from dicot plant species can be introduced into monocot plant
species by transferring a dicot artificial chromosome that contains a region
of euchromatic DNA containing expressed genes.

The artificial chromosomes can be generated or manipulated in such a
fashion that a large region of naturally occurring plant DNA becomes
incorporated into the artificial chromosome. This allows the artificial
chromosome to contain new genetic activities and hence carry new traits.
For example, an artificial chromosome can be introduced into a wild relative
of a crop plant under conditions whereby a portion of the DNA present in the
chromosomes of the wild relative is transferred to the artificial chromosome.
After isolation of the artificial chromosome, this naturally occurring region of
DNA from the wild relative, now located on the artificial chromosome, can be
introduced into the domesticated crop species and the genes encoded within
the transferred DNA expressed and evaluated for utility. New traits and gene
systems can be discovered in this fashion.

Artificial chromosomes modified to recombine with plant DNA offer
many advantages for the discovery and evaluation of traits in different plant
species. When the artificial chromosome containing DNA from one plant
species is introduced into a new plant species, new traits and genes can be
introduced. This use of an artificial chromosome makes it possible to
overcome the sexual barrier that prevents transfer of genes from one plant
species to another species. Using artificial chromosomes in this fashion
allows for many potentially valuable traits to be identified including traits that
are typically found in wild species. Other valuable applications for artificial
chromosomes include the ability to transfer large regions of DNA from one
plant species to another, DNA encoding potentially valuable traits such as
altered oil, carbohydrate or protein composition, multiple genes encoding
enzymes capable of producing valuable plant secondary metabolites, genetic
systems encoding valuable agronomic traits such as disease and insect
resistance, genes encoding functions that allow association with soil
bacteria such as growth promoting bacteria or nitrogen fixing bacteria, or
genes encoding traits that confer freezing, drought or other stress tolerances.
In this fashion, artificial chromosomes can be used to discover regions of
plant DNA that encode valuable traits.

The artificial chromosome can also be designed to allow the transfer
and subsequent incorporation of these valuable traits now located on the
artificial chromosome into the natural chromosomes of a plant species. In
this fashion the artificial chromosomes can be used to transfer large regions
of DNA encoding traits normally found in one plant species into another plant
species. In this fashion, it is possible to derive a plant cell that no longer
needs to carry an artificial chromosome to possess the new trait. Thus the
artificial chromosome would serve as the transfer mechanism to permit the
formation of plants with a greater degree of genetic diversity.

An artificial chromosome can be designed in a variety of ways to
accomplish the aforementioned purposes. An artificial chromosome can be
modified to contain sequences that promote homologous recombination
within plant cells, or be modified to contain a genetic system that functions
as a site-specific recombination system. For example, the DNA sequence of
Arabidopsis is now known. To construct an artificial chromosome capable of
recombining with a specific region of Arabidopsis DNA, a sequence of
Arabidopsis DNA normally located near a chromosomal location encoding
genes of potential interest can be introduced into an artificial chromosome by
methods provided herein. It may be desirable to include a second region of
DNA within the artificial chromosome that provides a second flanking
sequence to the region encoding genes of potential interest, to promote a
double recombination event which would ensure transfer of the entire
chromosomal region encoding genes of potential interest to the artificial
chromosome. The modified artificial chromosome, containing the DNA
sequences capable of homologous recombination, can then be
introduced into Arabidopsis cells and the homologous recombination event
selected.

It is convenient to include a marker gene to allow for the selection of a
homologous recombination event. The marker gene is preferably inactive
unless activated by an appropriate homologous recombination event. For
example, US 5,272,071 describes a method where an inactive plant gene is
activated by a recombination event such that desired homologous
recombination events can be easily scored. Similarly, US 5,501,967
describes a method for the selection of homologous recombination events by
activation of a silent selection gene first introduced into the plant DNA, the
gene being activated by an appropriate homologous recombination event.
Both of these methods can be applied to provide a selective process
for recombination between an artificial chromosome and
a plant chromosome. Once the homologous recombination event is
detected and selected, the artificial chromosome is isolated and introduced
into a recipient cell, for example, tobacco, corn, wheat or rice, and the
expression of the newly introduced DNA sequences evaluated. Selection of
recombinant events can take place in cell culture, or following seed formation
and screening of seedling plants or seed itself.

Phenotypic changes in the recipient plant cells containing the artificial
chromosome, or in regenerated plants containing the artificial chromosome,
allow for the evaluation of the nature of the traits encoded by the genes of
interest, for example, Arabidopsis DNA, under conditions naturally found in
plant cells, including the naturally occurring arrangement of DNA sequences
responsible for the developmental control of the traits in the normal
chromosomal environment.

Traits such as durable fungal or bacterial disease resistance, new oil and
carbohydrate compositions, valuable secondary metabolites such as
phytosterols, flavonoids, efficient nitrogen fixation or mineral utilization,
resistance to extremes of drought, heat or cold are all found within different
populations of plant species and are often governed by multiple genes. The use
of single gene transformation technologies does not permit the evaluation of the
multiplicity of genes controlling many valuable traits. Thus, incorporation of
these genes into artificial chromosomes allows the rapid evaluation of the utility
of these genetic combinations in heterologous plant species.

The large scale order and structure of the artificial chromosome provides
a number of unique advantages in screening for new utilities or new phenotypes
within heterologous plant species. The size of new DNA that can be carried by
an artificial chromosome can be millions of base pairs of DNA, representing
potentially numerous genes that may have different or new utility in a
heterologous plant cell. The artificial chromosome is a "natural" environment
for gene expression, so the problems of variable gene expression and silencing
seen for genes transferred by random insertion into a genome should not be
observed. Similarly, there is no need to engineer the genes for expression, and
the genes inserted would not need to be recombinant genes. Thus, transferred
genes are fully expected to be expressed in the typical temporal and spatial
fashion as observed in the species from where the genes were initially isolated.
A valuable feature for these utilities is the ability to isolate the artificial
chromosomes and to further isolate, manipulate and introduce into other cells
artificial chromosomes carrying unique genetic compositions.

Thus, artificial chromosomes and homologous recombination
in plant cells can be used to isolate and identify many valuable crop traits. In
addition to the use of artificial chromosomes for the isolation and testing of
large regions of naturally occurring DNA, methods for the use of artificial
chromosomes and cloned DNA are also contemplated. Similar to that described
above, artificial chromosomes can be used to carry large regions of cloned DNA,
including that derived from other plant species.

The ability to incorporate DNA elements into artificial chromosomes as
they are being formed allows for the development of artificial chromosomes
specifically engineered as a platform for testing of new genetic combinations,
or "genomic" discoveries for model species such as Arabidopsis. Specific
"recombinase" systems can be used in plant cells to excise or re-arrange genes;
these same systems can be used to derive new gene combinations contained
on an artificial chromosome. In this regard, it is contemplated that the use of
site specific recombination sequences can have considerable utility in
developing artificial chromosomes containing DNA sequences recognized by
recombinase enzymes and capable of accepting DNA sequences containing
same. The use of site-specific recombination as a means to target an
introduced DNA to a specific locus has been demonstrated in the art and such
methods can be employed. The recombinase systems can also be used to
transfer the cloned DNA regions contained within the artificial chromosome to
the naturally occurring plant chromosomes.

Many site specific recombinases have been described in the literature
(Kilby et al., Trends in Genetics, 9(12): 413-418, 1993). Among these are:
an activity identified as R encoded by the pSR1 plasmid of Zygosaccharomyces
rouxii, FLP encoded by the 2 µm circular plasmid from Saccharomyces
cerevisiae and Cre-lox from the phage P1.

The integration function of site specific recombinases is contemplated as
a means to assist in the derivation of genetic combinations on artificial
chromosomes. In order to accomplish this, it is contemplated that a first step
of introducing site-specific recombinase sites into the genome of a plant cell in
an essentially random manner is conducted, such that the plant cell has one or
more site-specific recombinase recognition sequences on one or more of the
plant chromosomes. An artificial chromosome is then introduced into the plant
cell, the artificial chromosome engineered to contain a recombinase recognition
site capable of being recognized by a site specific recombinase. Optionally a
gene encoding a recombinase enzyme is also included, preferably under the
control of an inducible promoter. Expression of the site specific recombinase
enzyme in the plant cell, either by induction of an inducible recombinase gene
or by transient expression of a recombinase sequence, causes a site-specific
recombination event to take place, leading to the insertion of a region of the
plant chromosomal DNA containing the recombinase recognition site into the
recombinase recognition site of the artificial chromosome, forming an artificial
chromosome containing plant chromosomal DNA. The artificial chromosome
can be isolated and introduced into a heterologous host, preferably a plant host,
and expression of the newly introduced plant chromosomal DNA can be
monitored and evaluated for desirable phenotypic changes. Accordingly,
carrying out this recombination with a population of plant cells wherein the
chromosomally located recombinase recognition site is randomly scattered
throughout the chromosomes of the plant can lead to the formation of a
population of artificial chromosomes, each with a different region of plant
chromosomal DNA, each representing a new genetic combination.

This particular method involves the precise site-specific insertion of
chromosomal DNA into the artificial chromosome. This precision has been
demonstrated in the art. For example, Fukushige and Sauer (Proc. Natl. Acad.
Sci. USA, 89:7905-7909, 1992) demonstrated that the Cre-lox homologous
recombination system could be successfully employed to introduce DNA into a
predefined locus in a chromosome of mammalian cells. In this demonstration,
a promoter-less antibiotic resistance gene modified to include a lox sequence at
the 5' end of the coding region was introduced into CHO cells. Cells were
re-transformed by electroporation with a plasmid that contained a promoter with
a lox sequence and a transiently expressed Cre recombinase gene. Under the
conditions employed, the expression of the Cre enzyme catalyzed the
homologous recombination between the lox site in the chromosomally located
promoter-less antibiotic resistance gene and the lox site in the introduced
promoter sequence, leading to the formation of a functional antibiotic resistance
gene. The authors demonstrated efficient and correct targeting of the
introduced sequence: 54 of 56 lines analyzed corresponded to the predicted
single copy insertion of the DNA due to Cre catalyzed site specific homologous
recombination between the lox sequences.

The use of the same Cre-lox system has been demonstrated in plants
(Dale and Ow, Gene 91:79-85, 1995) to specifically excise, delete or insert
DNA. The precise event is controlled by the orientation of the lox DNA sequences:
in cis, the lox sequences direct the Cre recombinase to either delete (lox
sequences in direct orientation) or invert (lox sequences in inverted orientation)
the DNA flanked by the sequences, while in trans the lox sequences can direct a
homologous recombination event resulting in the insertion of a recombinant
DNA. Accordingly, a lox sequence may first be added to a genome of a plant
species capable of being transformed and regenerated to a whole plant, to serve
as a recombinase target DNA sequence for recombination with an artificial
chromosome. The lox sequence may be optionally modified to further contain
a selectable marker which is inactive but can be activated by insertion of the lox
recombinase recognition sequence into the artificial chromosome.
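
The relationship between lox site arrangement and recombination outcome
described above can be summarized in a short sketch. This is only an
illustrative lookup of the three cases named in the text; the function name
and its parameters are hypothetical and are not part of any described method.

```python
from typing import Optional


def cre_lox_outcome(in_cis: bool, direct_orientation: Optional[bool] = None) -> str:
    """Return the Cre-mediated event the text associates with a pair of lox sites.

    in_cis: True if both lox sites lie on the same DNA molecule.
    direct_orientation: for sites in cis, True if the sites are in direct
        orientation and False if they are inverted. Ignored for sites in trans.
    """
    if not in_cis:
        # Sites on separate molecules: recombination inserts the incoming DNA.
        return "insertion"
    if direct_orientation is None:
        raise ValueError("orientation is required for lox sites in cis")
    # Sites on one molecule: direct repeats delete, inverted repeats invert.
    return "deletion" if direct_orientation else "inversion"


if __name__ == "__main__":
    print(cre_lox_outcome(in_cis=True, direct_orientation=True))
    print(cre_lox_outcome(in_cis=True, direct_orientation=False))
    print(cre_lox_outcome(in_cis=False))
```

The in trans case corresponds to the situation of interest here, in which one
site is chromosomal and the other resides on the artificial chromosome.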

A promoterless marker gene or selectable marker gene linked to the
recombinase recognition sequence, which is first inserted into the chromosomes
of a plant cell, can be used to engineer a platform chromosome. A promoter is
linked to a recombinase recognition site, in an orientation that allows the
promoter to control the expression of the marker or selectable marker gene
upon recombination within the artificial chromosome. Upon a site-specific
recombination event between a recombinase recognition site in a plant
chromosome and the recombinase recognition site within the introduced
artificial chromosome, a cell is derived with a recombined artificial chromosome,
the artificial chromosome containing an active marker or selectable marker
activity that permits the identification and/or selection of the cell.

The artificial chromosomes can be transferred to other plant species and
the functionality of the new combinations tested. The ability to conduct such
an inter-chromosomal transfer of sequences has been demonstrated in the art.
For example, the use of the Cre-lox recombinase system to cause a
chromosome recombination event between two chromatids of different
chromosomes has been shown.

Any number of recombination systems may be employed (see, U.S.
provisional application Serial No. filed the same day herewith under attorney
docket no. 24601-P420). Such systems include, but are not limited to,
bacterially derived systems such as the Int/att system of phage lambda and the
Gin/gix system.

More than one recombination system may be employed, including, for
example, one recombinase system for the introduction of DNA into an artificial
chromosome, and a second recombinase system for the subsequent transfer of
the newly introduced DNA contained within an artificial chromosome into the
naturally occurring chromosome of a second plant species. The choice of the
specific recombination system used will be dependent on the nature of the
modification contemplated.

By having the ability to isolate an artificial chromosome, and in particular
artificial chromosomes containing plant chromosomal DNA introduced via
site-specific recombination, and to re-introduce the chromosome into other cells,
particularly plant cells, these new combinations can be evaluated in different
crop species without the need to first isolate and modify the genes, or to carry out
multiple transformations or gene transfers to achieve the same isolation and
testing of combinations of the genes in plants. The use of a site-specific
recombinase and artificial chromosomes also allows the convenient
recovery of the plant chromosomal region into other recombinant DNA vectors
and systems for manipulation and study.

The artificial chromosomes can be engineered as platforms to accept
large regions of cloned DNA, such as that contained in Bacterial Artificial
Chromosomes (BACs) or Yeast Artificial Chromosomes (YACs). It is further
contemplated that, as a result of the typical structure of amplification-based
artificial chromosomes, such as, for example, SATACs (or ACes), containing
tandemly repeated DNA blocks, more than one cloned DNA sequence can be
introduced by recombination processes. In particular, recombination within a
predefined region of the tandemly repeated DNA within the artificial
chromosome provides a mechanism to "stack" numerous regions of cloned
DNA, including large regions of DNA contained within BAC or YAC clones.
Thus, multiple combinations of genes can be introduced onto artificial
chromosomes and these combinations tested for functionality. In particular, it
is contemplated that multiple YACs or BACs can be stacked onto an artificial
chromosome, the BACs or YACs containing multiple genes of complex
pathways or multiple genetic pathways. The BACs or YACs are typically
selected based on genetic information available within the public domain, for
example from the Arabidopsis Information Management System
(http://aims.cps.msu.edu/aims/index.html) or the information related to the plant
DNA sequences available from the Institute for Genomic Research
(http://www.tigr.org) and other sites known to those skilled in the art.
Alternatively, clones can be chosen at random and evaluated for functionality.
It is contemplated that combinations providing a desired phenotype can be
identified by isolation of the artificial chromosome containing the combination
and analyzing the nature of the inserted cloned DNA.

In another embodiment of the methods provided herein for discovering
genes associated with plant traits, the artificial chromosome used to transfer
plant DNA to a host cell for evaluation therein will contain large regions of plant
DNA, in particular plant euchromatin, as a result of the process by which the
artificial chromosome is produced. In particular, the artificial chromosome may
be an amplification-based artificial chromosome, including, but not limited to:
(1) a minichromosome arising from breakage of a dicentric chromosome, (2) an
artificial chromosome containing one or more regions of repeating nucleic acid
units wherein the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid, (3) an artificial chromosome
containing one or more regions of repeating nucleic acid units wherein the
repeat region(s) is made up predominantly of euchromatic DNA or contains
about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or greater than
90% euchromatic DNA, (4) an artificial chromosome containing one or more
regions of repeating nucleic acid units wherein the artificial chromosome is
made up of substantially equivalent amounts of heterochromatin and
euchromatin, (5) an artificial chromosome containing one or more regions
of repeating nucleic acid units having common nucleic acid sequences that
represent euchromatic and heterochromatic nucleic acid and (6) a sausage-like
structure that contains a portion or all of a euchromatin-containing arm of a
plant chromosome.

In these methods for discovering genes associated with plant traits,
because the artificial chromosome used to transfer plant DNA to a host cell for
evaluation therein is generated to already contain large amounts of plant DNA,
in particular plant euchromatin, there is no need to introduce plant euchromatin
into the artificial chromosomes by homologous or site-specific recombination.

4. Use of artificial chromosomes for preparation and screening of 
libraries 

Since large fragments of DNA can be incorporated into artificial
chromosomes (ACs), they are well-suited for use as cloning vehicles that can
accommodate entire genomes in the preparation of genomic DNA libraries,
which then can be readily screened for functionality as described above or for
specific gene sequences for further modification and study. For example, it is
possible to use artificial chromosomes to prepare artificial chromosome libraries
containing plant genomic DNA that are useful in the identification and isolation
of functional DNA components such as genes, centromeric DNA and telomeric
DNA from a variety of different species of plants.

The following examples are included for illustrative purposes only and are 
not intended to limit the scope of the invention. 

Example 1 

Generation of Arabidopsis protoplasts

Plant protoplasts are typically generated from plant cells following
standard techniques (for example, Maheshwari et al., Crit. Rev. Plant Sci.
14:149-178, 1995; Ramulu et al., Methods in Molecular Biology 111:227-242,
1999). Typically, plant protoplasts are prepared from fresh plant tissue, e.g.,
leaf, or can be prepared by converting cell suspension cultures to protoplasts
by removal of the cell walls enzymatically. For production of Arabidopsis
protoplasts, the methods of Karesh et al. (Plant Cell Reports 9:575-578, 1991)
and Mathur et al. (Plant Cell Reports 14:221-226, 1995) were used to generate
Arabidopsis suspension cultures by modifications thereof as described below.
These cells were maintained in liquid culture and subcultured as required,
usually between 7 and 10 days in culture.

Establishment of suspension cultures 

Cell suspension cultures derived from root callus of Arabidopsis thaliana
cv. Columbia, RLD and Landsberg erecta were used. Calli were induced from
roots of 3 week-old seedlings on callus induction medium containing MS basic
media (Murashige and Skoog (1962) Physiol. Plant. 15:473-497) with 3%
sucrose, 0.5 mg/l naphthalene acetic acid (NAA) and 0.05 mg/l kinetin
(Sigma-Aldrich Canada). The cell suspension cultures were grown from the calli in
liquid callus induction medium at 22°C with shaking at 120 rpm. They were
subcultured every 7 days.

Generation of protoplasts 

One gram of 4-5 day-old suspension culture was incubated in 6 ml
enzyme solution containing 1% Cellulase 'Onozuka' R-10 and 0.25%
Macerozyme R-10 in 35 g/l CaCl2·2H2O (Hartmann et al. (1998) Plant Mol. Biol.
36:741-754) and incubated at 22°C in the dark with shaking at 70 rpm for 15
h. The protoplast mixture was poured through a 100 µm nylon mesh sieve and
centrifuged at 250 x g for 5 min. The protoplasts were washed with 35 g/l
CaCl2·2H2O and resuspended in 10 ml floating medium containing B5 medium
(Gamborg et al. (1968) Exp. Cell Res. 50:151-158) with 144 g/l sucrose and 1
mg/l 2,4-dichlorophenoxyacetic acid (2,4-D). The protoplasts were centrifuged
at 80 x g for 10 min, collected at the interface and used immediately for
transfection.
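
As a worked illustration of the quantities in the enzyme solution above, the
following sketch converts the stated concentrations into masses for the 6 ml
volume. It assumes that the percentages are weight/volume (grams per 100 ml),
which the text does not state explicitly; the function names are hypothetical.

```python
def grams_for_percent_wv(percent: float, volume_ml: float) -> float:
    """Mass in grams for a weight/volume percentage (g per 100 ml). Assumed w/v."""
    return percent / 100.0 * volume_ml


def grams_for_g_per_l(g_per_l: float, volume_ml: float) -> float:
    """Mass in grams for a concentration given in g/l."""
    return g_per_l * volume_ml / 1000.0


if __name__ == "__main__":
    volume_ml = 6.0  # enzyme solution volume given in the text
    recipe = {
        "Cellulase 'Onozuka' R-10 (1%)": grams_for_percent_wv(1.0, volume_ml),
        "Macerozyme R-10 (0.25%)": grams_for_percent_wv(0.25, volume_ml),
        "CaCl2·2H2O (35 g/l)": grams_for_g_per_l(35.0, volume_ml),
    }
    for component, grams in recipe.items():
        print(f"{component}: {grams * 1000:.0f} mg")
```

Under the weight/volume assumption, this corresponds to about 60 mg cellulase,
15 mg macerozyme and 210 mg CaCl2·2H2O per 6 ml of enzyme solution.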

Example 2 

Generation of Tobacco Mesophyll Protoplasts 

Mesophyll protoplasts were generated from leaves of sterile plantlets of N.
tabacum cv. Xanthi. The plantlets were grown aseptically on MSO medium (MS
basal media, 3% sucrose, 0.05% morpholinoethanesulfonic acid (MES), 1.0
mg/l benzyl adenine (BA), 0.1 mg/l NAA and 0.8% agar, pH 5.8) at 22°C under
a 16/8 h photoperiod (see also Bilang et al. (1994) Plant Molecular Biology
Manual A1:1-6). Fully expanded leaves (2 x 4 cm) were cut in half, the main
vein removed and the upper epidermis scored with parallel cuts. Leaf pieces
were immersed in 6 ml enzyme solution containing 1.2% Cellulase 'Onozuka'
R-10 and 0.4% Macerozyme R-10 in K4 medium (Nagy and Maliga (1976) Z.
Pflanzenphysiol. 75:453-455) and incubated at 22°C for 15 h without shaking.
The protoplasts were purified by pouring through a 100 µm nylon mesh sieve.
The suspension of protoplasts was carefully overlaid with 1 ml W5 solution (Bilang
et al. (1994) Plant Molecular Biology Manual A1:1-6) and centrifuged at 80 x g
for 10 min. Protoplasts were then resuspended in W5 solution at a density of
1 x 10^6 protoplasts/ml and stored at 4°C for 1 to 2 hours prior to treatment, for
example, DNA uptake or chromosome transfer.
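
The resuspension step above targets a fixed protoplast density. A minimal
sketch of the corresponding volume calculation follows; the total protoplast
count used in the example is a hypothetical value, not a result from the text,
and the function name is illustrative only.

```python
def resuspension_volume_ml(total_protoplasts: float,
                           target_density_per_ml: float = 1e6) -> float:
    """Volume of W5 solution needed to reach the target protoplast density.

    The default density of 1 x 10^6 protoplasts/ml is the value given in the text.
    """
    if total_protoplasts <= 0 or target_density_per_ml <= 0:
        raise ValueError("counts and densities must be positive")
    return total_protoplasts / target_density_per_ml


if __name__ == "__main__":
    # Hypothetical yield of 4.5 million protoplasts from one preparation.
    print(resuspension_volume_ml(4.5e6), "ml")
```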

Example 3 

Production of Tobacco Protoplasts from Suspension Cultures 

Tobacco BY-2 protoplasts are prepared from suspension cultures according
to the method of Nagata et al. [(1981) Molecular and General Genetics,
184:161-165].

Example 4 

Generation of Brassica Hypocotyl Protoplasts 

Genotypes of Brassica napus, B. oleracea, B. juncea and B. carinata may
be used to generate protoplasts. Seeds of Brassica napus were
surface-sterilized (for 2 min with 70% ethanol, then for 20 min with 2.4%
sodium hypochlorite containing one drop of Tween 20 per 100 ml). Seeds were
rinsed thoroughly with sterile distilled water and grown aseptically on
autoclaved germination medium (half-strength basal Murashige and Skoog's
medium (MS), 1% sucrose, 0.8% agar, pH 5.8). Unless otherwise indicated,
the protoplast generation procedures were performed aseptically and solutions
and media were filter-sterilized. Alternatively, protoplasts can be generated and
cultured successfully from different explants using various protocol
modifications (for example, Kao et al. (1991) Plant Science 75:63-72; Kao et
al. (1990) Plant Cell Rep. 5:311-315; Kao and Seguin-Swartz (1987) Plant Cell
Tiss. Org. Cult. 10:79-90; Kao (1977) Mol. Gen. Genet. 150:225-230).

Generation of Hypocotyl Protoplasts

Hypocotyls were excised from 4 or 5 day-old seedlings grown aseptically
in the dark with or without light exposure for a few hours prior to use. The
explants were cut transversely into 2-5 mm pieces and incubated in enzyme
solution (salts, vitamins and organic acids of Kao's medium (Kao (1977) Mol.
Gen. Genet. 150:225-230), 0.4 g/l CaCl2·2H2O, 13% sucrose, 1%
Cellulase 'Onozuka' R-10, 0.1% Pectolyase Y23, pH 5.6) in petri dishes, in
darkness, without agitation for 14-18 hours, then with agitation on a rotary
shaker (ca. 50 rpm) for 15-30 min.

The mixture was filtered through a 63 µm nylon screen into centrifuge
tubes, and an equal volume of 17.5% sucrose was added to each tube.
Following centrifugation (ca. 100 x g, 8 min), the protoplast band that formed at
the top of each tube was collected. Protoplasts were washed 3 times by
resuspension in wash solution [solution W5 of Menczel and Wolfe (1984, Plant
Cell Rep. 3:196-198) at a reduced strength (0.8X)] followed by centrifugation
at 100 x g for 3-5 min and discarding the supernatant.

Protoplasts were cultured in Kao's medium containing the salts, vitamins
and organic acids with 30 g/l sucrose, 68.4 g/l glucose, 0.5 mg/l NAA, 0.5 mg/l
BA, 0.5 mg/l 2,4-D, pH 5.7, at a density of 1 x 10^5 per ml and incubated at
25°C, 16 h photoperiod, in dim fluorescent light (25 µE m^-2 s^-1).
After 5-8 days in culture, 1-1.5 ml of feeder medium containing the above
medium, except with 55.8 g/l glucose instead of 68.4 g/l, were added to each
dish, and the dishes were placed under brighter fluorescent light (50 µE m^-2 s^-1).
At about 14 days, 1-2 ml of medium were removed from each dish, and 2-3 ml
of feeder medium containing basal B5 medium (Gamborg et al. (1968) Exp. Cell
Res. 50:151-158), 3% sucrose, 3.8% glucose, 0.5 mg/l BA, 0.5 mg/l NAA, and
0.5 mg/l 2,4-D, pH 5.7, were added. At about 21 days, if microcolonies have
not yet formed, the cultures can be fed with the last feeder medium except with
2.2% glucose instead of 3.8%. Protoplast cultures can be washed when
necessary by adding new feeder medium, gently swirling petri dishes, allowing
cells to settle, removing most of the supernatant and adding fresh medium to
the dishes.

At 3-5 weeks, microcolonies were embedded with medium containing a 1:1
mixture of the last feeder medium and proliferation medium, which contains the
components of the feeder medium with 0.9% glucose and 1.6% agarose to
make a concentration of 0.8% in the final mixture. Cultures were incubated as
described above in bright fluorescent light (80-100 µE m^-2 s^-1). After 10 days to 2
weeks, green colonies were plated onto the regeneration medium.
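
The final agarose concentration stated above follows from the 1:1 mixing
ratio. The sketch below shows the arithmetic; it assumes that the last feeder
medium contains no agarose, which is implied but not stated in the text, and
the function name is hypothetical.

```python
def mixed_concentration(conc_a: float, vol_a: float,
                        conc_b: float, vol_b: float) -> float:
    """Concentration after combining two solutions (same units in, same units out)."""
    total_volume = vol_a + vol_b
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return (conc_a * vol_a + conc_b * vol_b) / total_volume


if __name__ == "__main__":
    # Feeder medium: assumed 0% agarose. Proliferation medium: 1.6% agarose.
    # Equal volumes (1:1), as described in the text.
    final_agarose = mixed_concentration(0.0, 1.0, 1.6, 1.0)
    print(f"final agarose: {final_agarose:.1f}%")
```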

Example 5 

Preparation of a Transformation Vector Useful for the Induction of
Plant Artificial Chromosome Formation

Plant artificial chromosomes (PACs) can be generated by introducing
nucleic acid, such as DNA, which can include an amplification-inducing DNA
and/or a targeting DNA, for example rDNA or lambda DNA, into a plant cell,
allowing the cell to grow, and then identifying from among the resulting cells
those that include a chromosome with a structure that is distinct from that of
any chromosome that existed in the cell prior to introduction of the nucleic acid.
The structure of a PAC reflects amplification of chromosomal DNA, for example,
segmented, repeat region-containing and heterochromatic structures. It is also
possible to select cells that contain structures that are precursors to PACs, for
example, chromosomes containing more than one centromere and/or fragments
thereof, and culture and/or manipulate them to ultimately generate a PAC within
the cell.

In the method of generating PACs, the nucleic acid can be introduced
into a variety of plant cells. The nucleic acid can include targeting DNA and/or
a plant-expressible DNA encoding one or multiple selectable markers (e.g., DNA
encoding bialaphos (bar) resistance) or scorable markers (e.g., DNA encoding
GFP). Examples of targeting DNA include, but are not limited to, N. tabacum
rDNA intergenic spacer sequence (IGS) and Arabidopsis rDNA such as the 18S,
5.8S, 26S rDNA and/or the intergenic spacer sequence. The DNA can be
introduced using a variety of methods, including, but not limited to,
Agrobacterium-mediated methods, PEG-mediated DNA uptake and
electroporation using, for example, standard procedures according to Hartmann
et al. [(1998) Plant Molecular Biology 36:741]. The cell into which such DNA
is introduced can be grown under selective conditions, or can initially be grown
under non-selective conditions and then transferred to selective media. The
cells or protoplasts can be placed on plates containing a selection agent to
grow, for example, individual calli. Resistant calli can be scored for scorable
marker expression. Metaphase spreads of resistant cultures can be prepared,
and the metaphase chromosomes examined by FISH analysis using specific
probes in order to detect amplification of regions of the chromosomes. Cells
that have artificial chromosomes with functioning centromeres or artificial
chromosomal intermediate structures, including, but not limited to, dicentric
chromosomes, formerly dicentric chromosomes, minichromosomes,
heterochromatin structures (e.g., sausage chromosomes), and stable
self-replicating artificial chromosomal intermediates as described herein, are
identified and cultured. In particular, the cells containing self-replicating artificial
chromosomes are identified.

The DNA introduced into a plant cell for the generation of PACs can be
in any form, including in the form of a vector. An exemplary vector for use in
methods of generating PACs can be prepared as follows.

For the production of artificial chromosomes, plant transformation
vectors, as exemplified by pAgIIa and pAgIIb, containing a selectable marker,
a targeting sequence, and a scorable marker were constructed using procedures
well known in the art to combine the various fragments. The vectors can be
prepared using vector pAg1 as a base vector and inserting the following DNA
fragments into pAg1: DNA encoding β-glucuronidase under the control of the
nopaline synthase (NOS) promoter fragment and flanked at the 3' end by the
NOS terminator fragment, a fragment of mouse satellite DNA and an N.
tabacum rDNA intergenic spacer sequence (IGS). In constructing plant
transformation vectors, vector pAg2 can also be used as the base vector.

1. Construction of pAg1

Vector pAg1 (SEQ. ID. NO: 1; see Figure 1) is a derivative of the
CAMBIA vector named pCambia 3300 (Center for the Application of Molecular
Biology to International Agriculture, i.e., CAMBIA, Canberra, Australia;
www.cambia.org), which is a modified version of vector pCambia 1300 to
which has been added DNA from the bar gene conferring resistance to
phosphinothricin. The nucleotide sequence of pCambia 3300 is provided in
SEQ. ID. NO: 2. pCambia 3300 also contains a lacZ alpha sequence containing
a polylinker region.

pAg1 was constructed by inserting two new functional DNA fragments
into the polylinker of pCambia 3300: one sequence containing an attB site and
a promoterless zeomycin resistance-encoding DNA flanked at the 3' end by an
SV40 polyA signal sequence, and a second sequence containing DNA from the
hygromycin resistance gene (hygromycin phosphotransferase) conferring
resistance to hygromycin for selection in plants. Although the zeomycin-SV40
polyA signal fusion is not expected to provide the basis for zeomycin selection
in plant cells, it can be activated in mammalian cells by insertion of a functional
promoter element into the attB site by site-specific recombination catalyzed by
the lambda att integrase. Thus, the inclusion of the attB-zeomycin sequences
allows for evaluation of functionality of plant artificial chromosomes in
mammalian cells by activation of the zeomycin resistance-encoding DNA, and
provides an att site for further insertion of new DNA sequences into plant
artificial chromosomes formed as a result of using pAg1 for plant
transformation. The second functional DNA fragment allows for selection of
plant cells with hygromycin. Thus, pAg1 contains DNA from the bar gene
conferring resistance to phosphinothricin, DNA from the hygromycin resistance
gene, both resistance-encoding DNAs under the control of a separate
cauliflower mosaic virus (CaMV) 35S promoter, and the attB-promoterless
zeomycin resistance-encoding DNA.

pAg1 is a binary vector containing Agrobacterium right and left T-DNA
border sequences for use in Agrobacterium-mediated transformation of plant
cells or protoplasts with the DNA located between the border sequences. pAg1
also contains the pBR322 Ori for replication in E. coli. pAg1 was constructed
by ligating HindIII/PstI-digested p3300attBZeo with HindIII/PstI-digested
pBSCaMV35SHyg as follows (see Figure 2).

a. Generation of p3300attBZeo

Plasmid pCambia 3300 was digested with PstI/Ecl136II and ligated with
PstI/StuI-digested pLITattBZeo (the nucleotide sequence of pLITattBZeo is
provided in SEQ. ID. NO: 19) to generate p3300attBZeo, which contains an attB
site, a promoterless zeomycin resistance-encoding DNA flanked at the 3' end
by an SV40 polyA signal, and a reconstructed PstI site.

b. Generation of pBSCaMV35SHyg

A DNA fragment containing DNA encoding hygromycin
phosphotransferase flanked by the CaMV 35S promoter and the CaMV 35S
polyA signal sequence was obtained by PCR amplification of plasmid pCambia
1302 (GenBank Accession No. AF234298 and SEQ. ID. NO: 3). The primers
used in the amplification reaction were as follows:

CaMV35SpolyA:
5'-CTGAATTAACGCCGAATTAATTCGGGGGATCTG-3' SEQ. ID. NO: 4

CaMV35Spr:
5'-CTAGAGCAGCTTGCCAACATGGTGGAGCA-3' SEQ. ID. NO: 5

The 2100-bp PCR fragment was ligated with EcoRV-digested pBluescript II SK+
(Stratagene, La Jolla, CA, U.S.A.) to generate pBSCaMV35SHyg.

c. Generation of pAg1

To generate pAg1, pBSCaMV35SHyg was digested with HindIII/PstI and
ligated with HindIII/PstI-digested p3300attBZeo. Thus, pAg1 contains the
pCambia 3300 backbone with DNA conferring resistance to phosphinothricin and
hygromycin under the control of separate CaMV 35S promoters, an
attB-promoterless zeomycin resistance-encoding DNA recombination cassette and
unique sites for adding additional markers, e.g., DNA encoding GFP. The attB
site facilitates the addition of new DNA sequences to plant or animal, e.g.,
mammalian, artificial chromosomes, including PACs formed as a result of using
the pAg1 vector, or derivatives thereof, in the production of PACs. The attB
site provides a convenient site for recombinase-mediated insertion of DNAs
containing a homologous att site.

2. pAg2

The vector pAg2 (SEQ. ID. NO: 6; see Figure 3) is a derivative of vector
pAg1 formed by adding DNA encoding a green fluorescent protein (GFP), under
the control of a NOS promoter and flanked at the 3' end by a NOS polyA signal,
to pAg1. pAg2 was constructed as follows (see Figure 4). A DNA fragment
containing the NOS promoter was obtained by digestion of pGEM-T-NOS, or
pGEMEasyNOS (SEQ. ID. NO: 7), containing the NOS promoter in the cloning
vector pGEM-T-Easy (Promega Biotech, Madison, WI, U.S.A.), with XbaI/NcoI
and was ligated to an XbaI/NcoI fragment of pCambia 1302 containing DNA
encoding GFP (without the CaMV 35S promoter) to generate p1302NOS (SEQ.
ID. NO: 8) containing GFP-encoding DNA in operable association with the NOS
promoter. Plasmid p1302NOS was digested with SmaI/BsiWI to yield a
fragment containing the NOS promoter and GFP-encoding DNA. The fragment
was ligated with PmeI/BsiWI-digested pAg1 to generate pAg2. Thus, pAg2
contains DNA from the bar gene conferring resistance to phosphinothricin, DNA
conferring resistance to hygromycin, both resistance-encoding DNAs under the
control of a cauliflower mosaic virus 35S promoter, DNA encoding kanamycin
resistance, a GFP gene under the control of a NOS promoter and the
attB-zeomycin resistance-encoding DNA. One of skill in the art will appreciate that
other fragments can be used to generate the pAg1 and pAg2 derivatives and
that other heterologous DNA can be incorporated into pAg1 and pAg2 derivatives
using methods well known in the art.

3. pAgIIa and pAgIIb transformation vectors

Vectors pAgIIa and pAgIIb were constructed by inserting the following
DNA fragments into pAg1: DNA encoding β-glucuronidase, the nopaline
synthase terminator fragment, the nopaline synthase (NOS) promoter fragment,
a fragment of mouse satellite DNA and an N. tabacum rDNA intergenic spacer
sequence (IGS). The construction of pAgIIa and pAgIIb was as follows (see
Figure 5).

An N. tabacum rDNA intergenic spacer (IGS) sequence (SEQ. ID. NO: 9;
see also GenBank Accession No. Y08422; see also Borysyuk et al. (2000)
Nature Biotechnology 18:1303-1306; Borysyuk et al. (1997) Plant Mol.
Biol. 35:655-660; U.S. Patent Nos. 6,100,092 and 6,355,860) was obtained by
PCR amplification of tobacco genomic DNA. The IGS can be used as a
targeting sequence by virtue of its homology to tobacco rDNA genes; the
sequence is also an amplification promoter sequence in plants. This fragment
was amplified using standard PCR conditions (e.g., as described by Promega
Biotech, Madison, WI, U.S.A.) from tobacco genomic DNA using the primers
shown below:

NTIGS-F1
5'-GTG CTA GCC AAT GTT TAA CAA GAT G-3' (SEQ ID No. 10) and

NTIGS-R1
5'-ATG TCT TAA AAA AAA AAA CCC AAG TGA C-3' (SEQ ID No. 11)

Following amplification, the fragment was cloned into pGEM-T Easy to give
pIGS-1.

A fragment of mouse satellite DNA (Msat1 fragment; GenBank Accession
No. V00846; and SEQ ID No. 12) was amplified via PCR from pSAT-1 using the
following primers:

MSAT-F1
5'-AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C-3' (SEQ ID No. 13)
and

MSAT-R1
5'-ATA ACC GCG GAG TCC TTC AGT GTG CAT-3' (SEQ ID No. 14)

This amplification added a SacII and a HindIII site at the 5' end and a SacII site
at the 3' end of the PCR fragment. This fragment was then cloned into the
SacII site in pIGS-1 to give pMIGS-1, providing a eukaryotic centromere-specific
DNA and a convenient DNA sequence for detection via FISH.
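
The restriction sites contributed by the two Msat primers can be checked
directly from the primer sequences. The following sketch scans each primer for
the SacII (CCGCGG) and HindIII (AAGCTT) recognition sequences; the helper
function and data structures are illustrative only and not part of the
described procedure.

```python
# Recognition sequences of the two enzymes named in the text.
RECOGNITION_SITES = {"SacII": "CCGCGG", "HindIII": "AAGCTT"}

# Primer sequences as given in the text (5' to 3'), spaces retained for readability.
PRIMERS = {
    "MSAT-F1": "AAT ACC GCG GAA GCT TGA CCT GGA ATA TCG C",
    "MSAT-R1": "ATA ACC GCG GAG TCC TTC AGT GTG CAT",
}


def find_sites(sequence: str) -> dict:
    """Map enzyme name to the 0-based start positions of its site in the sequence."""
    seq = sequence.replace(" ", "").upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, find_sites(primer))
```

The forward primer carries both a SacII and a HindIII site and the reverse
primer carries a SacII site only, consistent with the sites stated to be added
at the 5' and 3' ends of the PCR fragment.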

A functional marker gene containing a NOS-promoter:GUS:NOS
terminator fusion was then constructed containing the NOS promoter (GenBank
Accession No. U09365; SEQ ID No. 15), E. coli β-glucuronidase coding
sequence (from the GUS gene; GenBank Accession No. S69414; and SEQ ID
No. 16), and the nopaline synthase terminator sequence (GenBank Accession
No. U09365; SEQ ID No. 18). The NOS promoter in pGEM-T-NOS was added
to a promoterless GUS gene in pBlueScript (Stratagene, La Jolla, CA, U.S.A.)
using NotI/SpeI to form pNGN-1, which has the NOS promoter in the opposite
orientation relative to the GUS gene.

pMIGS-1 was digested with NotI/SpeI to yield a fragment containing the
mouse major satellite DNA and the tobacco IGS, which was then added to
NotI-digested pNGN-1 to yield pNGN-2. The NOS promoter was then re-oriented to
provide a functional GUS gene, yielding pNGN-3, by digestion and religation
with SpeI. Plasmid pNGN-3 was then digested with HindIII, and the HindIII
fragment containing the β-glucuronidase coding sequence and the rDNA
intergenic spacer, along with the Msat sequence, was added to pAg1 to form
pAgIIa, using the unique HindIII site in pAg1 located near the right T-DNA
border of pAg1, within the T-DNA region.

Another plasmid vector, referred to as pAgIIb, was also recovered, which
contained the inserted HindIII fragment in the opposite orientation relative to
that observed in pAgIIa. Thus, pAgIIa and pAgIIb differ only in the orientation
of the HindIII fragment containing the mouse major satellite sequence, the GUS
DNA sequence and the IGS sequence (see Figure 6). The nucleotide sequence
of pAgIIa is provided in SEQ. ID. NO: 21.
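
Recovery of both orientations is expected when a fragment bounded by the same
restriction site at each end is ligated into a single site, because the
HindIII recognition sequence is its own reverse complement. The sketch below
illustrates this with a toy payload; the payload sequence is hypothetical and
does not correspond to any sequence in the vectors described.

```python
HINDIII_SITE = "AAGCTT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def both_orientations(payload: str) -> tuple:
    """Return a HindIII-bounded fragment read in its two possible orientations."""
    forward = HINDIII_SITE + payload + HINDIII_SITE
    return forward, reverse_complement(forward)


if __name__ == "__main__":
    # Hypothetical 8-nt payload standing in for the Msat/GUS/IGS insert.
    forward, reverse = both_orientations("GGATCCAT")
    print(forward)
    print(reverse)
    # Both orientations present identical HindIII ends to the vector.
    print(forward[:6] == reverse[:6] == HINDIII_SITE)
```

Because the ends are identical in either orientation, ligation alone does not
distinguish the two products; orientation is resolved afterwards, for example
by a diagnostic digest or sequencing.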

Vectors pAg1, pAg2, pAgIIa and pAgIIb, as well as similarly designed
vectors containing a recombination site and a promoter (e.g., plant or animal
promoter), and possibly other regulatory sequences, in operable association with
DNA encoding a protein or other product for expression in a host cell, such
as a plant or animal cell, can be used in the transfer of any protein (or other
product)-encoding nucleic acid of interest into a cell for expression thereof. For
example, any protein (or other product)-encoding nucleic acid of interest (in
operable association with transcriptional regulatory sequences suitable for use in a
particular host cell) can be inserted into any of the vectors pAg1, pAg2, pAgIIa
and pAgIIb and thereby incorporated into a plant, animal or other artificial
chromosome, particularly a platform artificial chromosome ACes, as described
herein.

Example 6 

Agrobacterium-Mediated Transformation of Plant Cells

Plant cells were transformed via Agrobacterium-mediated transformation
according to standard procedures (see, for example, Horsch et al. (1988) Plant
Molecular Biology Manual, A5:1-9, Kluwer Academic Publisher, Dordrecht,
The Netherlands). Briefly, Agrobacterium strain GV3101/pMP90 (see Koncz and Schell
(1986) Molecular and General Genetics 204:383-396) was transformed with
pAgIIa and pAgIIb (see Example 5) by heat shock, and the plasmid integrity of
pAgIIa and pAgIIb after transformation was verified by HindIII digest pattern.
pAgIIa/pMP90 or pAgIIb/pMP90 were cultured in 5 ml AB minimum medium
(Horsch et al. (1988) Plant Molecular Biology Manual, A5:1-9, Kluwer Academic
Publisher, Dordrecht, The Netherlands) containing 25 µg/ml kanamycin and 25 µg/ml
gentamycin at 28°C for two days.
Leaf disks of tobacco and Arabidopsis and root segments of Arabidopsis
were prepared as follows: tobacco leaves from 3 to 4 week-old explants were
cut into disks 1 cm in diameter, and Arabidopsis leaves were taken from 3
week-old seedlings and cut transversely into two halves. Roots of 3 week-old
Arabidopsis were excised into segments 1 cm in length. Cocultivation was
carried out by immersing leaf disks or root segments in bacterial culture for 2
minutes and then transferring the infected tissues to culture medium without
antibiotics for 2 days at 22°C with 16 hours/day of cool white fluorescent
light. The leaf disks of tobacco and Arabidopsis were cultured on MS104 medium
(MS, 3% sucrose, 0.05% MES, 1.0 mg/l BA, 0.1 mg/l NAA and 0.8% agar, pH 5.8)
and root segments on callus-inducing medium, CIM 0.5/0.05 (B5, 2% glucose,
0.05% MES, 0.5 mg/l 2,4-D, 0.05 mg/l kinetin and 0.8% agar, pH 5.8).
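
The MS104 recipe above mixes percentage components with mg/l components. A
minimal Python sketch of the batch arithmetic follows. It assumes the
percentages are weight/volume (the text does not say so), it omits the MS basal
salts, and the function and variable names are illustrative only.

```python
# Illustrative sketch: component masses for a batch of MS104 medium.
# Assumption (not stated in the source): "%" means % w/v, i.e. g per 100 ml.
MS104_PERCENT_WV = {"sucrose": 3.0, "MES": 0.05, "agar": 0.8}
MS104_MG_PER_L = {"BA": 1.0, "NAA": 0.1}


def grams_for_batch(percent_wv, mg_per_l, volume_l):
    """Return grams of each listed component for volume_l liters of medium."""
    # 1% w/v corresponds to 10 g per liter.
    grams = {name: pct * 10.0 * volume_l for name, pct in percent_wv.items()}
    # mg/l components are converted from milligrams to grams.
    grams.update({name: mg / 1000.0 * volume_l for name, mg in mg_per_l.items()})
    return grams


batch = grams_for_batch(MS104_PERCENT_WV, MS104_MG_PER_L, volume_l=0.5)
for name, mass in batch.items():
    print(f"{name}: {mass:.5f} g")
```

The same function can be applied to CIM 0.5/0.05 by substituting the glucose,
2,4-D and kinetin values listed above.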

The transformed leaf disks and root segments were then transferred to
selection medium of MS104 or CIM 0.5/0.05, respectively, containing 20 mg/l
hygromycin and 300 mg/l Timentin for the elimination of Agrobacterium. The
selection medium was refreshed every two weeks, and green shoots
regenerated. Plants were analyzed for the expression of the DNA encoding GUS
by standard histochemical and fluorescent assays and for evidence of
amplification of the inserted DNA by quantitative PCR. Numerous plants were
obtained that expressed high levels of GUS, and multiple copies of the GUS gene
were observed by Fluorescent In Situ Hybridization (FISH) and PCR analysis.
Thus, amplification of the chromosomal regions containing the inserted DNA was
observed. One of skill in the art will appreciate that GUS expression, or the
expression of any other gene, can be assessed using methods well known in the
art.

Example 7

Transfection and Culture of Arabidopsis Protoplasts

E. coli strain Stbl4 (Gibco Life Sciences) was transformed with pAgIIa,
pAgIIb, and one of two targeting plasmids containing the rDNA repeat sequence
from Arabidopsis (plasmid pJHD-14A) or the 26S rDNA from Arabidopsis (plasmid
pJHD2-19A), as described by Doelling et al. [(1993) Proc. Natl. Acad. Sci.
U.S.A. 90:7528-7532], via electroporation according to standard procedures.
A single colony was grown up in 250 ml LB medium containing 50 μg/ml
kanamycin (for selection based on the kanamycin resistance-encoding DNA in
pAgIIa and pAgIIb) or 50 μg/ml ampicillin (for selection based on the ampicillin
resistance-encoding DNA in pJHD-14A and pJHD2-19A) and cultured at 30°C
with shaking at 225 rpm for 16 hours. The plasmids were isolated according to
standard procedures well known in the art. The structural integrity of the
plasmids was checked by restriction digestion pattern, and the plasmids were
linearized with restriction enzymes. Plasmids were sterilized with chloroform
and 70% ethanol before use for transfection.

Arabidopsis protoplasts were resuspended in the culture medium (see
Example 1) at a density of 2 x 10^6 protoplasts/ml. A 300 μl protoplast
suspension was pipetted into a 15 ml tube, and 30 μl of plasmid (pAgIIa or
pAgIIb) and targeting DNA (pJHD-14A or pJHD2-19A), containing 10 μg plasmid
and 100 μg targeting sequence, was added, followed immediately by slowly
adding 300 μl of 10% PEG. The targeting plasmids were included in the
transfection procedure in order to ensure that the amount of rDNA targeting DNA
(i.e., tobacco rDNA from pAgIIa or pAgIIb and Arabidopsis DNA from the targeting
vectors) was sufficient to effect recombination of the introduced DNA at a
homologous site in an Arabidopsis chromosome. DNA was typically used in a
ratio of 10:1, targeting DNA (pJHD-14A or pJHD2-19A, or Lambda DNA) to
plasmid DNA (pAgIIa or pAgIIb, or a selectable marker plasmid), or in a ratio of
5:1. Generally, the number of base pairs of targeting DNA sufficient for
insertion into a plant chromosome is at least about 50 bp, or about 60 bp, or
about 70 bp, or about 80 bp, or about 90 bp, or about 100 bp, or about 150
bp, or about 200 bp, or about 300 bp, or about 400 bp, or about 500 bp, or
about 600 bp, or about 700 bp, or about 800 bp, or about 900 bp, or about 1
kb, or about 2 kb, or about 3 kb, or about 4 kb, or about 5 kb, or about 6 kb,
or about 7 kb, or about 8 kb, or about 9 kb, or about 10 kb or more. The
amount and length of targeting DNA sufficient to effect introduction into a
chromosome can be determined empirically and can vary for different plant
species.
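
The quantities in this step (a 10:1 or 5:1 ratio of targeting DNA to plasmid
DNA, and 300 μl of protoplasts at 2 x 10^6 per ml) can be worked through with a
short Python sketch. It assumes the stated ratio is a mass ratio, which is
consistent with the 10 μg and 100 μg figures above; the function names are
illustrative only.

```python
# Illustrative sketch of the DNA and protoplast arithmetic in this Example.
# Assumption: the targeting:plasmid ratio is a mass ratio (ug to ug).


def dna_mix(plasmid_ug, ratio_targeting_to_plasmid=10.0):
    """Return (targeting_ug, total_ug) for a plasmid mass and a mass ratio."""
    targeting_ug = plasmid_ug * ratio_targeting_to_plasmid
    return targeting_ug, plasmid_ug + targeting_ug


def protoplast_count(density_per_ml, volume_ul):
    """Return the number of protoplasts in volume_ul of suspension."""
    return density_per_ml * volume_ul / 1000.0


targeting_ug, total_ug = dna_mix(10.0, 10.0)   # 10:1 ratio with 10 ug plasmid
n_protoplasts = protoplast_count(2e6, 300.0)   # 300 ul at 2 x 10^6 per ml
print(targeting_ug, total_ug, n_protoplasts)
```

Calling dna_mix(10.0, 5.0) gives the corresponding amounts for the 5:1 ratio.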

The mixture was shaken gently, and immediately 300 μl of 10% PEG
solution was added slowly with gentle shaking. The protoplast mixture was
incubated at 22°C for 10-15 min with several cycles of gentle shaking. DNA
uptake was quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The
protoplasts were then centrifuged at 80 x g for 7 min and resuspended in culture
medium. For selection, 10 to 40 mg/l hygromycin was added to protoplast
cultures 14 days after transfection, and the culture medium was refreshed every
7 days. The protoplast cultures could also be selected after embedding in 0.6%
agarose by transferring to a culture medium containing 20 mg/l hygromycin. The
cultures were incubated for 14 days or longer at 22°C.
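
The quench solution is specified by mass concentration (72.4 g/l Ca(NO3)2).
The text does not say whether the anhydrous salt or a hydrate was weighed, so
the following Python sketch simply converts the figure to molarity under both
assumptions. The atomic masses are standard values; the names are illustrative.

```python
# Illustrative sketch: molarity of 72.4 g/l calcium nitrate.
# Assumption: the source does not state the hydration state of the salt,
# so both the anhydrous form and the tetrahydrate are evaluated.
ATOMIC_MASS = {"Ca": 40.078, "N": 14.007, "O": 15.999, "H": 1.008}

ANHYDROUS = {"Ca": 1, "N": 2, "O": 6}              # Ca(NO3)2
TETRAHYDRATE = {"Ca": 1, "N": 2, "O": 10, "H": 8}  # Ca(NO3)2 . 4 H2O


def molar_mass(formula_counts):
    """Sum atomic masses weighted by the atom counts in the formula."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula_counts.items())


def molarity(grams_per_l, formula_counts):
    """Convert a mass concentration (g/l) to mol/l."""
    return grams_per_l / molar_mass(formula_counts)


print(f"anhydrous:    {molarity(72.4, ANHYDROUS):.3f} M")
print(f"tetrahydrate: {molarity(72.4, TETRAHYDRATE):.3f} M")
```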

The Arabidopsis protoplasts were analyzed for the presence and
expression of the DNA encoding GUS. Recovered microcalli strongly expressed
GUS and were resistant to selective agents, indicating amplification of the
inserted DNA. Alternatively, the transfection of Arabidopsis protoplasts can
be conducted without using targeting DNA sequences since pAgIIa and pAgIIb
include a region of rDNA (i.e., the tobacco rDNA IGS) that can act as a targeting
sequence as long as a sufficient amount of pAgIIa/b plasmid is used in the
transfection procedure.

Example 8

Transfection and Culture of Tobacco Protoplasts

As described in Example 7, E. coli strain Stbl4 was transformed with pAgIIa,
pAgIIb, pJHD-14A (targeting DNA) and pJHD2-19A (targeting DNA) via
electroporation, and plasmid DNA was recovered and linearized with restriction
enzymes. Plasmids were sterilized with chloroform and 70% ethanol before use
for transfection.

The tobacco protoplasts (see Examples 2 and 3) were resuspended in the
culture medium (see Example 2) at a density of 2 x 10^6 protoplasts/ml. A 300
μl protoplast suspension was pipetted into a 15 ml tube, and 30 μl of plasmid
and targeting DNA was added as described in Example 7. The mixture was
shaken gently, and immediately 300 μl of 10% PEG solution was added slowly
with gentle shaking. The tobacco protoplast mixture was incubated at 22°C
for 10-15 min with several cycles of gentle shaking. DNA uptake was
quenched by the addition of 5 ml of 72.4 g/l Ca(NO3)2. The protoplasts were then
centrifuged at 80 x g for 7 min and resuspended in culture medium.

The recovery of viable tobacco protoplasts following DNA uptake ranged
from 65% to 75%. Typically, greater than 35% of the protoplasts initiated
cell division within 7 days of treatment. Protoplast cells
were analyzed for gene expression (in this case for the expression of the
reporter DNA GUS, but alternatively, the expression of other genes can be
monitored). Between 4% and 6% of the recovered cells exhibited GUS
expression.
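
As a rough illustration of what these percentages imply, the following Python
sketch estimates how many GUS-expressing cells might be expected from one
300 μl transfection. It assumes that the 4% to 6% figure applies to the viable
protoplasts recovered after DNA uptake and that the two percentages can simply
be multiplied; neither assumption is stated in the text.

```python
# Illustrative estimate only; the starting count of 6 x 10^5 protoplasts
# corresponds to 300 ul at 2 x 10^6 protoplasts/ml.
# Assumption: GUS-positive fraction is taken relative to recovered viable cells.


def expected_gus_positive(n_treated, recovery=(0.65, 0.75),
                          gus_fraction=(0.04, 0.06)):
    """Return (low, high) expected numbers of GUS-expressing cells."""
    low = n_treated * recovery[0] * gus_fraction[0]
    high = n_treated * recovery[1] * gus_fraction[1]
    return low, high


low, high = expected_gus_positive(600_000)
print(f"about {low:.0f} to {high:.0f} GUS-expressing cells")
```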

The protoplasts were subjected to selection procedures to recover
transformed cells. For selection of tobacco cells, 10 to 40 mg/l hygromycin
was added to protoplast cultures 10-14 days after transfection, and the culture
medium was refreshed every 7 days. Leaf disc selection was performed in the
presence of 40 mg/l hygromycin. Transformed microcalli were recovered and
analyzed for the expression of the GUS reporter gene. GUS-positive calli were
isolated and subjected to FISH analysis (see Example 13). Plant cells that
exhibited amplification of the inserted DNA were identified.

Example 9 

Transfection and Culture of Brassica Protoplasts 

Brassica protoplasts (see Example 4), following the final washing step
after filtering through a 63 μm nylon screen and centrifugation, are collected
and used for DNA transfection as described in Example 8. Brassica protoplast
cultures following DNA uptake or transformation by Agrobacterium can be
selected with either hygromycin or glufosinate ammonium in liquid culture or in
embedded semi-solid cultures. The effective concentration of hygromycin is 10
to 40 mg/l for 2 to 4 weeks or continuously, whereas that for glufosinate
ammonium is 2 to 60 mg/l for 5 days to 2 weeks. Selection can impede growth,
and additional transfers to similar media may be required.

Example 10

Plant Regeneration from Brassica Protoplasts

Colonies of Brassica protoplasts (1 mm or larger in diameter) are plated
onto regeneration medium (basal Murashige and Skoog's medium, 1% sucrose,
2 mg/l BA, 0.01 mg/l NAA, 0.8% agarose, pH 5.6). Cultures are incubated
under the conditions described in Example 4. Cultures are transferred onto
fresh regeneration medium every 2 weeks. Regenerated shoots are transferred
onto autoclaved rooting medium (basal Murashige and Skoog's medium, 1%
sucrose, 0.1 mg/l NAA, 0.8% agar, pH 5.8) and incubated under dim
fluorescent light (25 μE m^-2 s^-1). Plantlets are potted in a soil-less mix (for
example, Terra-lite Redi-Earth, W.R. Grace & Co., Canada Ltd., Ajax, Ontario)
containing fertilizer (Nutricote 14-14-14 type 100, Plant Products Co. Ltd,
Brampton, Ontario) and grown in a growth room (20°C/15°C, 16 h
photoperiod, 100-140 μE m^-2 s^-1) with fluorescent and incandescent light at
soil level. Plantlets are covered with transparent plastic cups for one week to
allow for acclimatization.

Example 11

Isolation of Nuclei from Protoplasts

To facilitate analysis, plant cells can be subjected to nuclei isolation, and
the isolated nuclei can be analyzed by FISH or PCR. To isolate the nuclei,
protoplast calli were reprotoplasted according to the procedure of Mathur et al.
with modifications (see Mathur et al. Plant Cell Report (1995) 14:221-226).
The protoplast calli were digested with 1.2% Cellulase 'Onozuka' R-10 and
0.4% w/v Macerozyme R-10 in nuclei isolation buffer (10 mM MES-pH 5.5,
0.2 M sucrose, 2.5 mM EDTA, 2.5 mM DTT, 0.1 mM spermine, 10 mM NaCl,
10 mM KCl and 0.15% Triton X-100) for 3 hours. After centrifugation at 80
x g for 10 minutes, the pellets of protoplasts were resuspended in hypertonic
buffer of 12.5% W5 solution (Hinnisdaels et al. (1994) Plant Molecular Biology
Manual G2:1-13, Kluwer Academic Publisher, Belgium) for 10 minutes. To
promote disruption of protoplasts, the protoplast suspension was forced through
a syringe needle four times. The disrupted protoplasts were filtered through 5
μm meshes to remove debris and centrifuged at 200 x g for 10 min. By
repeated washing of the pellet in a nuclei isolation buffer containing
phenylmethylsulfonylfluoride (PMSF) and centrifugation at 200 x g for 10
minutes, nuclei were collected as a white pellet free of cytoplasmic
contamination and cellular debris. Samples were fixed in 3:1 methanol:glacial
acetic acid and were analyzed by FISH.

Example 12

Mitotic Arrest of Plant Cells for Detection of Amplification and
Artificial Chromosome Formation

In general, plant cells or protoplasts are cultured for two or more
generations prior to mitotic arrest. Typically, 5 μg/ml colchicine is added to
the cultures for 12 hours to accumulate mitotic plant cells. The mitotic cells
are harvested by gentle centrifugation. Alternatively, plant cells (grown on
plastic or in suspension) can be arrested in different stages of the cell cycle
with chemical agents other than colchicine, such as, but not limited to,
hydroxyurea, vinblastine, colcemid or aphidicolin, or through the deprivation of
nutrients, hormones, or growth factors. Chemical agents that arrest the cells in
stages other than mitosis, such as, but not limited to, hydroxyurea and
aphidicolin, are used to synchronize the cycles of all cells in the population
and are then removed from the cell medium to allow the cells to proceed, more or
less simultaneously, to mitosis, at which time they can be harvested to disperse
the chromosomes.

Example 13

Detection of Amplification and Artificial Chromosome Formation by
Fluorescence In Situ Hybridization (FISH)

A variety of plant cells can be analyzed by fluorescence in situ hybridization
(FISH) methods (Fransz et al. (1996) Plant J. 9:421-430; Fransz et al. (1998)
Plant J. 13:867-876; Wilkes et al. (1995) Chromosome Research 3:466-472;
Busch et al. (1994) Chromosome Research 2:15-20; Nkongolo (1993) Genome
35:701-705; Leitch et al. (1994) Methods in Molecular Biology 28:177-185;
Murata et al. (1997) Plant J. 12:31-37) to identify amplification events and
artificial chromosome formation.

FISH is used to detect specific DNA sequences on chromosomes, in
particular to detect regions of plant chromosomes that have undergone
amplification as a result of the introduction of heterologous DNA as described
herein, or to detect artificial chromosome formation in plant cells. FISH
chromosome spreads of Arabidopsis and tobacco plant cells into which
heterologous DNA has been introduced are generated using colchicine or similar
cell cycle arresting agents and various DNA probes (e.g., rDNA probe, Lambda
DNA probe, selectable marker probe). The cells are analyzed for the presence
of amplified regions of chromosomes, in particular amplification of the rDNA
regions, and those cells exhibiting amplification are further cultured and
analyzed for the formation of artificial chromosomes.

The chromosomes of plant cells subjected to introduction of heterologous
DNA and growth to generate artificial chromosomes can also be analyzed by
scanning electron microscopy. Preparation of mitotic chromosomes for
scanning electron microscopy can be performed using methods known in the
art (see, e.g., Sumner (1991) Chromosoma 100:410-418). The chromosomes
can be observed, for example, with a Hitachi S-800 field emission scanning
electron microscope operated with an accelerating voltage of 25 kV.

Example 14

Detection of Amplification and Artificial Chromosome Formation by
IdU Labeling of Chromosomes

The structure of the chromosomes in plant cells can be analyzed by labeling
the chromosomes with iododeoxyuridine (IdU), or other nucleotide analog, and
using an IdU-specific antibody to visualize the chromosome structure. Plant cell
cultures selected following introduction of heterologous DNA are labeled with
IdU following standard protocols (Fujishige and Taniguchi (1998) Chromosome
Research 6:611-619; Yanpaisan et al. (1998) Biotechnology and Bioengineering,
58:515-528; Trick and Bates (1996) Plant Cell Reports, 15:986-990; Binarova
et al. (1993) Theoretical and Applied Genetics, 87:9-16; Wang et al. (1991)
Journal of Plant Physiology, 138:200-203). Plant cells in culture, typically
suspension culture, are used. A series of sub-cultures are initiated, and IdU
labeling is performed as described above. Cells are allowed to incorporate IdU
for up to a week, depending on the doubling time of the culture. Labeled
chromosomes can be detected in plant cells (Fujishige and Taniguchi (1998)
Chromosome Research 6:611-619; Binarova et al. (1993) Theoretical and
Applied Genetics 87:9-16) and in mammalian cells (Gratzner and Leif (1981)
Cytometry 1:385-393) using procedures well known in the art. IdU-labeled
chromosomes are detected by immunocytochemical techniques. An anti-IdU
fluorescein isothiocyanate (FITC)-conjugated B44 clone antibody (Becton
Dickinson) is used to bind the IdU-DNA adduct in the DNA and is detected by
fluorescence microscopy (490 nm excitation, 519 nm emission). Analysis of
labeled chromosomes reveals the presence of amplified DNA regions and the
formation of artificial chromosomes.

Example 15 

Isolation of Metaphase Chromosomes from Protoplasts 

Artificial chromosomes, once detected in plant cells, may be isolated for
transfer to other organisms and in particular other plant species. Several
procedures may be used to isolate metaphase chromosomes from mitotic-arrested
plant cells, including, but not limited to, a polyamine-based buffer
system (Cram et al. (1990) Methods in Cell Biology 33:377-382), a modified
hexylene glycol buffer system (Hadlaczky et al. (1982) Chromosoma
86:643-659), a magnesium sulfate buffer system (Van den Engh et al. (1988)
Cytometry 9:266-270 and Van den Engh et al. (1984) Cytometry 5:108), an
acetic acid fixation buffer system (Stoehr et al. (1982) Histochemistry
74:57-61), and a technique utilizing hypotonic KCl and propidium iodide (Cram
et al. (1994) XVII meeting of the International Society for Analytical Cytology,
October 16-21, Tutorial IV Chromosome Analysis and Sorting with Commercial
Flow Cytometers; Cram et al. (1990) Methods in Cell Biology 33:376; de Jong
et al. (1999) Cytometry 35:129-133).

In an exemplary procedure, a hexylene glycol buffer is used to isolate plant
chromosomes from mitotic-arrested plant cells that have been converted to
protoplasts (Hadlaczky et al. (1982) Chromosoma 86:643-659). Chromosomes
are isolated from about 10^6 mitotic cells re-suspended in a glycine-hexylene
glycol buffer (100 mM glycine, 1% hexylene glycol, pH 8.4-8.6, adjusted with
a solution of saturated Ca(OH)2) supplemented with 0.1% Triton X-100 (GHT
buffer). The cells are incubated for 10 minutes at 37°C, and the chromosomes
are purified by differential centrifugation to pellet the nuclei (200 x g for 20
min) and sucrose gradient centrifugation (5-30% sucrose, 5600 x g for 60 min,
0-4°C). To avoid proteolytic degradation of chromosomal proteins, 1 mM PMSF
(phenylmethylsulfonylfluoride) is used in the presence of 1% isopropyl alcohol.
The proteins can be extracted from the isolated chromosomes using dextran
sulfate-heparin (DSH) extraction, and the chromosomes can be visualized via
electron microscopy using techniques known in the art (Hadlaczky et al. (1982)
Chromosoma (Berl.) 86:643-659; Hadlaczky et al. (1981) Chromosoma (Berl.)
67:537-555). Additionally, modifications of these procedures, including, but
not limited to, modification of the buffer composition (Carrano et al. (1979)
Proc. Natl. Acad. Sci. U.S.A. 76:1382-1384) and variation of the centrifugation
time or speed, to accommodate different plant species can be implemented by
any skilled artisan.
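
The centrifugation steps above are given as relative centrifugal force (200 x g,
5600 x g), whereas many centrifuges are set in rpm. The standard relation
RCF = 1.118 x 10^-5 x r x rpm^2 (r in cm) can be sketched in Python as follows.
The 10 cm rotor radius is a hypothetical value chosen for illustration and is
not taken from the procedure.

```python
import math

# Illustrative sketch: convert between relative centrifugal force and rpm.
# Assumption: the rotor radius (10 cm below) is hypothetical; use the
# radius of the rotor actually employed.
RCF_CONSTANT = 1.118e-5  # for radius in cm and speed in rpm


def rcf_for_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf_g, radius_cm):
    """Rotor speed (rpm) needed to reach rcf_g at radius_cm."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


for rcf in (200, 5600):
    print(f"{rcf} x g at r = 10 cm: {rpm_for_rcf(rcf, 10.0):.0f} rpm")
```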

Example 16 

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Dicot Plant: Arabidopsis

One method of delivery of mammalian artificial chromosomes (MACs) into
plant cells is the formation of microcells containing murine MACs and the
CaPO4-mediated uptake or the PEG-mediated fusion of these microcells with
plant protoplasts. In this example, microcells and plant protoplasts, such as but
not limited to tobacco and Arabidopsis protoplasts, were mixed (in a series of
microcell:protoplast ratios of 25:1, 10:1, 5:1 and 2:1) and fusion was observed.
Protocols for the formation of microcells are known in the art and are described,
for example, in US Patent Nos. 5,240,840, 4,806,476 and 5,298,429 and in
Fournier Proc. Natl. Acad. Sci. U.S.A. (1981) 78:6349-6353 and Lambert et al.
Proc. Natl. Acad. Sci. U.S.A. (1991) 88:5907-5912. The murine microcells
can be labeled with IdU, or the MACs stained with a specific dye such as, but
not limited to, propidium iodide or DAPI, prior to fusion with plant
protoplasts including, but not limited to, Arabidopsis and tobacco protoplasts,
to facilitate detection of the presence of MACs in the protoplasts.
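
The ratio series used in this Example (25:1, 10:1, 5:1 and 2:1 microcells to
protoplasts) translates directly into microcell numbers for a given protoplast
batch. A small Python sketch follows; the batch size of 10^5 protoplasts is a
hypothetical figure used only for illustration.

```python
# Illustrative sketch: microcells required for each ratio in the series.
# Assumption: the protoplast batch size passed in is hypothetical.
RATIOS = (25, 10, 5, 2)  # microcells per protoplast, as listed in this Example


def microcells_needed(n_protoplasts, ratios=RATIOS):
    """Map each ratio label to the number of microcells required."""
    return {f"{r}:1": r * n_protoplasts for r in ratios}


plan = microcells_needed(100_000)
for label, count in plan.items():
    print(f"{label} -> {count:,} microcells")
```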

In this example, MACs were introduced into Arabidopsis cells using
microcell-PEG mediated fusion. Microcells were formed from murine cells
containing an artificial chromosome (see U.S. Patent No. 6,077,697) and were
fused with freshly prepared Arabidopsis protoplasts in a ratio of 10:1,
microcells to protoplasts. Fusion occurred in the presence of 25% PEG 6000,
204 mM CaCl2, pH 6.9 within the first 5 minutes of mixing. Typically less than
about one minute of mixing is required to observe fusion between microcells
and protoplasts. Fused cells were washed with 240 mM CaCl2, then floated on
top of a solution of 204 mM sucrose in B5 salts. Cells were then transferred to
cell suspension culture media (MS, 87 mM sucrose, 2.7 μM naphthalene acetic
acid, 0.23 μM kinetin, pH 5.8). Empirical observations can be used to
determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.

Fused protoplasts were allowed to grow for one or more generations.
The presence of a mouse chromosomal sequence, including MACs, was
demonstrated by Southern hybridization with MAC probes, by FISH analysis and
by PCR analysis using, for example, satellite sequences known to exist on the
MAC chromosome. Thus, the mouse sequences were detected in the
Arabidopsis protoplasts.

To further demonstrate the transfer of mouse chromosomal sequence to
Arabidopsis protoplasts, Arabidopsis plant cell nuclei were isolated according
to Example 11 and were subjected to FISH analysis according to Example 13,
using the mouse major satellite DNA (SEQ ID No. 12). A portion of the nuclei
contained a significant signal using the mouse major satellite DNA, indicating
successful transfer of at least a mouse chromosome and/or MAC to the
Arabidopsis nuclei.

Similarly, PACs may be introduced into Arabidopsis protoplasts using
PEG- and/or calcium-mediated fusion procedures. Generation of
microprotoplasts and protoplasts can be conducted as described, for example,
in Example 1. Microprotoplasts formed from plant cells containing a plant
artificial chromosome are fused with freshly prepared Arabidopsis protoplasts,
for example, in a ratio of 10:1, microprotoplasts to protoplasts. Protoplasts
from other plants, including but not limited to, tobacco, wheat, maize and rice,
can also be used as the recipient of MACs and/or PACs. Fused protoplasts are
recovered and allowed to grow for one or more generations. The presence of
the transferred PACs can be analyzed using methods such as, for example,
those described herein (including Southern hybridization with PAC probes, FISH
analysis and PCR analysis using DNA sequences specific to the PAC).

Example 17

Transfer of Artificial Chromosomes into Plant Cells: Transfer of
Mammalian Artificial Chromosomes into a Second Dicot Plant: Tobacco

MACs were introduced into tobacco cells by microcell-PEG mediated
fusion, using the same microcells, MAC and protocol as described in Example
16. Microcells were formed from murine cells containing an artificial
chromosome and were fused with freshly prepared tobacco BY-2 protoplasts in
a ratio of 10:1, microcells to protoplasts. Fusion occurred in the presence of
20% PEG 4000 and 100-200 mM calcium chloride. Empirical observations are
used to determine the optimal concentration and composition of PEG and the
concentration of calcium that provides the highest degree of fusion with the
least toxicity.

DAPI staining of the microcells (e.g., by preincubation of the microcells
with DAPI at a final concentration of 1 μg/ml)
allowed visualization of the fusion and transfer of the chromosomes to the
tobacco protoplasts. Fused protoplasts were recovered and allowed to grow for
one or more generations. The fused protoplasts can be analyzed for the
presence of a MAC in a number of ways, including those described herein.
Nuclei were isolated, according to Example 11, from tobacco protoplasts that
had been fused with microcells, and were subjected to FISH
analysis according to Example 13, using the mouse major satellite DNA (SEQ
ID No. 12). Numerous nuclei were found to have incorporated a mouse
chromosome.

Example 18 

Transfer of Isolated Artificial Chromosomes by Lipid-Mediated Transfer
into a Monocot Plant: Rice

Isolated murine artificial chromosomes (MACs) prepared by sorting
through a FACS apparatus (de Jong et al. Cytometry (1999) 35:129-133) were
transferred into rice plant protoplasts by cationic lipid-mediated transfection of
the purified MAC. Purified MACs (see Example 15 and U.S. Patent No.
6,077,697) were mixed with LipofectAMINE 2000 (Gibco, Md, USA) as follows.
Typically, 15 μl of LipofectAMINE 2000 were added to 1 x 10^6 artificial
chromosomes in liquid buffer, the solution was allowed to complex for up to
three hours, and then the solution was added to 1 x 10^5 freshly prepared rice
protoplasts prepared using standard protoplast methods well known in the art.
The uptake of the lipid-complexed artificial chromosome was monitored by
adding to the mixture of protoplasts and purified artificial chromosomes a
fluorescent dye that stains DNA. Microscopic examination of the
protoplast/artificial chromosome mixture over the next several hours allowed the
visualization of the artificial chromosome being transported across the
protoplast cellular membrane and the presence of the readily identifiable MAC
in the cytoplasm of the rice plant cell.
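
The typical amounts given above (15 μl of LipofectAMINE 2000 per 1 x 10^6
artificial chromosomes, added to 1 x 10^5 protoplasts) correspond to about ten
chromosomes per protoplast. The Python sketch below scales those amounts to
other batch sizes. Linear scaling is an assumption made for illustration; the
text does not state that the procedure scales this way.

```python
# Illustrative sketch: scale the typical lipid/chromosome/protoplast amounts.
# Assumption (not from the source): reagent amounts scale linearly with the
# number of protoplasts.
UL_LIPID_PER_CHROMOSOME = 15.0 / 1e6    # 15 ul per 1 x 10^6 chromosomes
CHROMOSOMES_PER_PROTOPLAST = 1e6 / 1e5  # 1 x 10^6 chromosomes per 1 x 10^5 cells


def scale_transfection(n_protoplasts):
    """Return (n_chromosomes, lipid_ul) for a given number of protoplasts."""
    n_chromosomes = n_protoplasts * CHROMOSOMES_PER_PROTOPLAST
    lipid_ul = n_chromosomes * UL_LIPID_PER_CHROMOSOME
    return n_chromosomes, lipid_ul


chromosomes, lipid_ul = scale_transfection(1e5)
print(f"{chromosomes:.0f} chromosomes, {lipid_ul:.1f} ul lipid reagent")
```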

The same procedure as described in this Example for cationic
lipid-mediated transfer of an isolated MAC into rice protoplasts can be used to
transfer isolated MACs, as well as PACs, into rice and other plant protoplasts,
including but not limited to, tobacco, wheat, maize and Arabidopsis. Fused
protoplasts are recovered and allowed to grow for one or more generations.
The presence of the transferred MACs and PACs can be analyzed using
methods such as, for example, those described herein (including, but not limited
to, Southern hybridization with PAC probes, FISH analysis and PCR analysis
using DNA sequences specific to the PAC).



Example 19

Delivery of Plant Regulatory and Coding Sequences via a Promoterless attBZeo
Marker Gene in pAg2 onto a MAC Platform

As described in Examples 6-15, the plasmid pAg2, comprising plant
regulatory and selectable marker genes (SEQ ID NO: 6; prepared as set forth in
Example 5), can be used for the production of a MAC containing said
plant-expressible genes. In this example, pAg2, by virtue of the attBZeo DNA
sequences contained on the plasmid, is used for the loading of plant regulatory
and selectable marker genes onto MACs in mammalian cells using the attB
sequences to recombine with attP sequences present on a platform MAC. In
this example, platform MACs are produced with attP sequences and the plasmid
pAg2 is then loaded onto the platform MAC. New MACs so produced are
useful for introduction into plant cells by virtue of the plant-expressible markers
contained therein.

A. Construction of Platform MAC containing pSV40attPsensePUR (Figure
7; SEQ ID NO: 26).

An example of a selectable marker system for the creation of a MAC-based
platform into which the plasmid pAg2 can target plant regulatory and
coding sequences is shown in Figure 7. This system includes a vector
containing the SV40 early promoter immediately followed by (1) a 282 base pair
(bp) sequence containing the bacteriophage lambda attP site and (2) the
puromycin resistance marker. Initially a PvuII/StuI fragment containing the
SV40 early promoter from plasmid pPUR (Clontech Laboratories, Inc., Palo Alto,
CA; SEQ ID No. 22) was subcloned into the EcoICRI site of pNEB193 (a
pUC19 derivative obtained from New England Biolabs, Beverly, MA; SEQ ID No.
23) generating the plasmid pSV40193.

The attP site was PCR amplified from the lambda genome (GenBank
Accession # NC_001416) using the following primers:

attPUP: CCTTGCGCTAATGCTCTGTTACAGG SEQ ID No. 24

attPDWN: CAGAGGCAGGGAGTGGGACAAAATTG SEQ ID No. 25
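
As a simple check on the two primers listed above, the following Python sketch
reports their length and GC content and computes reverse complements. It is an
illustrative aid only and is not part of the described procedure; the function
names are hypothetical.

```python
# Illustrative sketch: basic properties of the attP primers listed above.
PRIMERS = {
    "attPUP": "CCTTGCGCTAATGCTCTGTTACAGG",
    "attPDWN": "CAGAGGCAGGGAGTGGGACAAAATTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G or C bases in seq."""
    return sum(base in "GC" for base in seq)


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    gc = gc_count(seq)
    print(f"{name}: {len(seq)} nt, {gc} G/C ({100.0 * gc / len(seq):.1f}% GC)")
```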

After amplification and purification of the resulting fragment, the attP site
was cloned into the SmaI site of pSV40193, and the orientation of the attP site
was determined by DNA sequence analysis (plasmid pSV40193attP). The gene
encoding puromycin resistance (Puro) was isolated by digesting the plasmid
pPUR (Clontech Laboratories, Inc., Palo Alto, CA) with AgeI/BamHI, followed by
filling in the overhangs with Klenow, and was subsequently cloned into the AscI
site downstream of the attP site of pSV40193attP, generating the plasmid
pSV40193attPsensePUR (Figure 7; SEQ ID NO: 26).

The plasmid pSV40193attPsensePUR was digested with ScaI and
co-transfected with the plasmid pFK161 into mouse LMtk- cells, and platform
artificial chromosomes were identified and isolated as described herein. Briefly,
puromycin-resistant colonies were isolated and subsequently tested for artificial
chromosome formation via fluorescent in situ hybridization (FISH) (using mouse
major and minor DNA repeat sequences, the puromycin gene and telomere
sequences as probes), and then subjected to fluorescence-activated cell sorting
(FACS). From this sort, a subclone was isolated containing an artificial
chromosome, designated B19-38. FISH analysis of the B19-38 subclone
demonstrated the presence of telomeres and mouse minor repeat sequences on the
MAC. DOT PCR revealed the absence of uncharacterized euchromatic regions on the
MAC. The process for generating this exemplary MAC platform containing multiple
site-specific recombination sites is summarized in Figure 5. This MAC chromosome
may subsequently be engineered to contain target gene expression nucleic acids
using the lambda integrase mediated site-specific recombination system as
described below.

B. Construction of Targeting Vector.

The construction of the targeting vector pAg2 is set forth in Example 5
herein.

C. Transfection of Promotorless Marker and Selection With Drug (See 
Figure 9). 

The mouse LMtk- cell line containing the MAC B19-38 (constructed as set forth
above and also referred to as a 2nd generation platform ACE) is plated onto
four 10 cm dishes at approximately 5 million cells per dish. The cells are
incubated overnight in DMEM with 10% fetal calf serum at 37°C and 5% CO2.
The following day the cells are transfected with 5 µg of the vector pAg2
(prepared as described in Example 5 above) and 5 µg of pCXLamIntR (encoding
a lambda integrase having an E to R amino acid substitution at position 174),
for a total of 10 µg per 10 cm dish. Lipofectamine Plus reagent is used to
transfect the cells according to the manufacturer's protocol. Two days
post-transfection, zeocin is added to the medium at 500 µg/ml. The cells are
maintained in selective medium until colonies are formed. The colonies are
then ring-cloned and genomic DNA is analyzed.
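
By way of illustration only, the quantities recited in the transfection step above can be tallied as in the following Python sketch. The dish count, cell number, DNA amounts and zeocin concentration are taken from the text; the function names and the medium volume passed to `zeocin_needed_ug` are hypothetical and are not part of the described procedure.

```python
# Illustrative bookkeeping only. Numeric values come from the protocol text;
# all names below are hypothetical helpers, not part of the described method.
DISHES = 4                      # four 10 cm dishes
CELLS_PER_DISH = 5_000_000      # approximately 5 million cells per dish
DNA_PER_DISH_UG = {"pAg2": 5.0, "pCXLamIntR": 5.0}  # micrograms per dish
ZEOCIN_UG_PER_ML = 500          # selection concentration

def transfection_totals(dishes=DISHES):
    """Return per-dish and whole-experiment totals for the transfection."""
    return {
        "dna_per_dish_ug": sum(DNA_PER_DISH_UG.values()),
        "total_cells": dishes * CELLS_PER_DISH,
        "total_dna_ug": {name: ug * dishes for name, ug in DNA_PER_DISH_UG.items()},
    }

def zeocin_needed_ug(medium_volume_ml):
    """Zeocin mass for a given medium volume (the volume is an assumption)."""
    return ZEOCIN_UG_PER_ML * medium_volume_ml

if __name__ == "__main__":
    print(transfection_totals())
    print(zeocin_needed_ug(10))  # 10 ml is an assumed volume, not from the text
```

The per-dish sum reproduces the 10 µg total stated in the text; scaling to a different number of dishes only requires changing the `dishes` argument.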

D. Analysis Of Clones (PCR, SEQUENCING). 

Genomic DNA (including MACs) is isolated from each of the candidate clones
with the Wizard kit (Promega), following the manufacturer's protocol. The
following primer set is used to analyze the genomic DNA isolated from the
zeocin-resistant clones: 5PacSV40 -
CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG (SEQ ID NO: 28); Antisense Zeo -
TGAACAGGGTCACGTCGTCC (SEQ ID NO: 29). PCR amplification using the above
primers and genomic DNA, which included MACs, from the candidate clones
results in a PCR product indicating the correct sequence for the desired
site-specific integration event.
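
The two primers recited above can be inspected with a short Python sketch such as the following, offered as an illustration rather than as part of the described method. It assumes that the 5PacSV40 sequence, which is broken across two lines in the text, is one contiguous oligonucleotide, and that the "Pac" in the primer name refers to a PacI recognition site (TTAATTAA). The helper names are hypothetical.

```python
# Primer sequences as recited in the text (SEQ ID NOs: 28 and 29).
# Assumption: the 5PacSV40 sequence is contiguous across the line break.
PRIMERS = {
    "5PacSV40": "CTGTTAATTAACTGTGGAATGTGTGTCAGTTAGGGTG",
    "Antisense Zeo": "TGAACAGGGTCACGTCGTCC",
}
PACI_SITE = "TTAATTAA"  # PacI recognition sequence

def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def summarize(name):
    """Length, GC fraction and PacI-site presence for a named primer."""
    seq = PRIMERS[name]
    return {
        "length": len(seq),
        "gc": round(gc_fraction(seq), 2),
        "has_paci_site": PACI_SITE in seq,
    }

if __name__ == "__main__":
    for primer_name in PRIMERS:
        print(primer_name, summarize(primer_name))
```

A summary of this kind is only a sanity check on the oligonucleotides; it says nothing about the size of the expected amplification product, which the text does not state.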

The MACs containing the pAg2 vector are identified and used for transfer
into plant (such as described in Examples 16 and 17) or animal cells for the
expression of the desired coding sequences contained therein. The MACs
containing pAg2 carry two plant selectable markers (hygromycin resistance,
resistance to phosphinothricin) and a visual selectable marker (green
fluorescent protein).

Example 20 

Construction of Plant-derived Shuttle Artificial Chromosome.

In another embodiment, the plant artificial chromosomes provided herein
are useful as selectable shuttle vectors that are able to move one or more
desired genes back and forth between plant and mammalian cells. In this
particular embodiment, the plant artificial chromosome is bi-functional in that
proper integration of donor nucleic acid can be selected for in both plant and
mammalian cells.

For example, a plant artificial chromosome is prepared as described in
Examples 6-15 above using the plasmid pAg2 (Example 5; SEQ ID NO: 6)
that has been modified to include the SV40attPsensePur coding region from the
plasmid pSV40193attPsensePur (described above in Example 19.A.). Thus, the
resulting plant-derived shuttle artificial chromosome contains DNA from the bar
gene conferring resistance to phosphinothricin in plant cells, DNA from the
hygromycin resistance gene conferring resistance to hygromycin in plant cells,
both resistance-encoding DNAs under the control of a separate cauliflower
mosaic virus (CaMV) 35S promoter, the attB-promoterless zeomycin
resistance-encoding DNA, and DNA conferring resistance to puromycin under the
control of a mammalian SV40 promoter. Accordingly, the presence of the shuttle
PAC in either a plant or mammalian cell can be selected for by treatment with,
for example, either hygromycin (plant) or puromycin (mammalian).
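
The marker content of the shuttle artificial chromosome described above can be summarized as a small lookup table. The following Python sketch is illustrative only; the field names and the `selection_agents` helper are hypothetical, and the entries simply restate the markers, promoters and host cell types named in the text.

```python
# Markers recited for the plant-derived shuttle artificial chromosome.
# A promoter of None marks the attB-linked, promoterless zeomycin marker,
# which this sketch treats as not selectable until a promoter is supplied.
MARKERS = [
    {"gene": "bar", "agent": "phosphinothricin",
     "promoter": "CaMV 35S", "host": "plant"},
    {"gene": "hygromycin resistance", "agent": "hygromycin",
     "promoter": "CaMV 35S", "host": "plant"},
    {"gene": "zeomycin resistance (attB-linked)", "agent": "zeomycin",
     "promoter": None, "host": None},
    {"gene": "puromycin resistance", "agent": "puromycin",
     "promoter": "SV40", "host": "mammalian"},
]

def selection_agents(host):
    """Agents usable to select for the shuttle PAC in the given host type."""
    return [m["agent"] for m in MARKERS
            if m["promoter"] is not None and m["host"] == host]

if __name__ == "__main__":
    print("plant:", selection_agents("plant"))
    print("mammalian:", selection_agents("mammalian"))
```

Under these assumptions the table returns phosphinothricin and hygromycin for plant cells and puromycin for mammalian cells, matching the selection options given in the text.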

Because the resulting plant-derived shuttle artificial chromosome contains
at least one SV40attP site therein similar to the platform MAC prepared in
Example 19.A. above, a donor vector containing an attB-selectable marker
sequence, such as a plasmid comprising an attBzeo (e.g., pAg2), can be used to
selectively introduce desired heterologous nucleic acids from any species (such
as plants, animals, insects and the like) into the shuttle artificial chromosome
that is present in a mammalian cell.
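
The selection logic relied on above, in which a promoterless attB-linked marker becomes expressed only after integration next to a promoter-attP site, can be pictured with a deliberately simplified Python model. This is a toy sketch under stated assumptions: chromosomes and vectors are reduced to ordered lists of feature names, the platform carries exactly one attP site, the donor is a circle written starting at its attB site, and the recombined junctions are given the placeholder names `att_hybrid_1` and `att_hybrid_2`. None of these names come from the text.

```python
def integrate(platform, donor):
    """Insert a circular donor (written from its attB site) at the platform attP.

    Toy model: the whole donor is inserted at the attP position and the two
    recombined junctions are represented by placeholder hybrid site names.
    """
    if platform.count("attP") != 1:
        raise ValueError("this toy model expects exactly one attP site")
    if not donor or donor[0] != "attB":
        raise ValueError("donor must be written starting at its attB site")
    i = platform.index("attP")
    return (platform[:i] + ["att_hybrid_1"] + donor[1:]
            + ["att_hybrid_2"] + platform[i + 1:])

def first_gene_after(features, promoter):
    """First non-att feature downstream of the promoter, or None."""
    start = features.index(promoter) + 1
    for feature in features[start:]:
        if not feature.startswith("att"):
            return feature
    return None

if __name__ == "__main__":
    platform = ["SV40 promoter", "attP", "puro"]
    donor = ["attB", "zeo", "donor payload"]
    print(first_gene_after(platform, "SV40 promoter"))
    print(first_gene_after(integrate(platform, donor), "SV40 promoter"))
```

In this model the promoterless zeo marker is read from the SV40 promoter only after integration, which is why resistance can serve as a screen for the site-specific event; random integration elsewhere would leave the marker without a promoter.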

Likewise, a plant promoter region, such as CaMV35S, can be used to
replace the SV40 promoter in the SV40attPPur region of the modified pAg2
plasmid described above. In this embodiment, because the resulting
plant-derived shuttle artificial chromosome contains at least one CaMV35SattP
site therein analogous to the platform MAC prepared in Example 19.A. above, a
donor vector containing an attB-selectable marker sequence, such as a plasmid
having attBkanamycin, or other plant selectable or scorable marker can be used
to selectively introduce desired heterologous nucleic acids from any species
(such as plants, animals, insects and the like) into the shuttle artificial
chromosome that is present in a plant cell.

Since modifications will be apparent to those of skill in this art, it is
intended that this invention be limited by only the scope of the appended
claims.

What is Claimed:

1. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a cell comprising an artificial chromosome that comprises
one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

2. The method of claim 1, wherein the artificial chromosome is
predominantly made up of one or more repeat regions.

3. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates amplification of a
region of a plant chromosome or targets it to an amplifiable region of a plant
chromosome.

4. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises one or more nucleic acids selected from the group consisting
of rDNA, lambda phage DNA and satellite DNA.

5. The method of claim 4, wherein the nucleic acid comprises plant
rDNA.

6. The method of claim 5, wherein the rDNA is from a plant selected
from the group consisting of Arabidopsis, Nicotiana, Solanum, Lycopersicon,
Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza.

7. The method of claim 4, wherein the nucleic acid comprises animal
rDNA.

8. The method of claim 7, wherein the rDNA is mammalian rDNA.

9. The method of claim 4, wherein the nucleic acid comprises rDNA
comprising sequence of an intergenic spacer region.

10. The method of claim 9, wherein the intergenic spacer region is
from DNA from a plant selected from the group consisting of Arabidopsis,
Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean.

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of cells
containing the nucleic acid.

12. The method of claim 11, wherein the nucleic acid sequence
encodes a fluorescent protein.

13. The method of claim 12, wherein the protein is a green fluorescent
protein.

14. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises sorting of cells into which
nucleic acid was introduced.

15. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises fluorescent in situ hybridization
(FISH) analysis of cells into which nucleic acid was introduced.

16. The method of claim 1, wherein the one or more plant
chromosomes contained in the cell is (are) selected from the group consisting
of Arabidopsis, tobacco and Helianthus cells.

17. The method of claim 16, wherein the cell is a plant protoplast.

18. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin,
hygromycin, dihydrofolate or sulfonylurea.

20. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial
chromosome is predominantly made up of one or more repeat regions.

22. A plant cell comprising an artificial chromosome, wherein the
artificial chromosome is produced by the method of claim 1 or claim 2.

23. A method of producing a transgenic plant, comprising introducing
the artificial chromosome of claim 20 or claim 21 into a plant cell.

24. The method of claim 23, wherein the artificial chromosome
comprises heterologous nucleic acid encoding a gene product.

25. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of enzymes, antisense
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors,
ribozymes, therapeutic proteins and biopharmaceutical proteins.

26. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for resistance to diseases, insects, herbicides
or stress in the plant.

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant.

29. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that alters the nutrient utilization and/or improves the
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid is
contained within a bacterial artificial chromosome (BAC) or a yeast artificial
chromosome (YAC).

31. A method of identifying plant genes encoding particular traits,
comprising:
generating an artificial chromosome comprising euchromatic DNA
from a first species of plant;
introducing the artificial chromosome into a plant cell of a second
species of plant; and
detecting phenotypic changes in the plant cell comprising the
artificial chromosome and/or a plant generated from the plant cell comprising
the artificial chromosome.

32. The method of claim 31, wherein the artificial chromosome is a
plant artificial chromosome or a mammalian artificial chromosome.

33. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a cell comprising one or more plant
chromosomes; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

34. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising a SATAC.

35. The method of claim 31, wherein the artificial chromosome is a
minichromosome produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a
neo-centromere and euchromatin.

36. The method of any of claims 33-35, wherein the nucleic acid
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers
resistance to phosphinothricin, ammonium glufosinate, glyphosate, kanamycin,
hygromycin, dihydrofolate or sulfonylurea.

38. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first plant species an artificial
chromosome capable of undergoing homologous recombination with the DNA
of the first plant species;
selecting for a recombination event between the artificial chromosome
and the DNA of the first plant species; and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

39. The method of claim 31, wherein the artificial chromosome
comprising euchromatic DNA from a first plant species is produced by a method
comprising:
introducing into a plant cell of a first species an artificial chromosome
capable of undergoing site-specific recombination with the DNA of the first plant
species;
selecting for a site-specific recombination event between the artificial
chromosome and the DNA of the first plant species, and
selecting an artificial chromosome comprising euchromatic DNA from the
first plant species.

40. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence.

42. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence.

43. The method of claim 39, wherein the DNA of the plant cell of a
first species is modified to comprise a site-specific recombination sequence and
the artificial chromosome comprises a site-specific recombination sequence that
is complementary to the site-specific recombination sequence of the plant cell
of a first plant species.

44. The method of claim 39, wherein the site-specific recombination
is catalyzed by a recombinase enzyme.

45. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into a first chromosome of a plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into a second chromosome of the plant cell;
introducing a recombinase activity into the plant cell, wherein the
activity catalyzes recombination between the first and second chromosomes
and whereby an acrocentric plant chromosome is produced.

46. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome.

47. The method of claim 45, wherein the second nucleic acid is
introduced into the distal end of the arm of the second chromosome.

48. The method of claim 45, wherein the first nucleic acid is
introduced into the pericentric heterochromatin of the first chromosome and the
second nucleic acid is introduced into the distal end of the arm of the second
chromosome.

49. A method for producing an acrocentric plant chromosome,
comprising:
introducing a first nucleic acid comprising a site-specific
recombination site into the pericentric heterochromatin of a chromosome in a
plant cell;
introducing a second nucleic acid comprising a site-specific
recombination site into the distal end of the chromosome, wherein the first and
second recombination sites are located on the same arm of the chromosome;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is
produced.

50. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising a recombination site adjacent
to nucleic acid encoding a selectable marker into a first plant cell;
generating a first transgenic plant from the first plant cell;
introducing nucleic acid comprising a promoter functional in a plant
cell, a recombination site and a recombinase coding region in operative linkage
into a second plant cell;
generating a second transgenic plant from the second plant cell;
crossing the first and second plants;
obtaining plants resistant to an agent that selects for cells
containing the nucleic acid encoding the selectable marker; and
selecting a resistant plant that contains cells comprising an
acrocentric plant chromosome.

51. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 5% euchromatic DNA.

52. The method of any of claims 45-50, wherein the DNA of the short
arm of the acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA.

54. The method of any of claims 45-49, wherein the nucleic acid
introduced into a chromosome comprises nucleic acid encoding a selectable
marker.

55. An acrocentric plant artificial chromosome, wherein the short arm
of the acrocentric chromosome does not contain euchromatic DNA.

56. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant acrocentric chromosome in a
cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome, is
predominantly heterochromatic.

57. The method of claim 56, wherein the acrocentric chromosome is
produced by the method of any of claims 45-49.

58. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions
wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

59. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a dicot plant species.

60. The method of claim 4, wherein the nucleic acid comprises plant
rDNA from a monocot plant species.

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant.

62. The method of claim 9, wherein the rDNA is plant rDNA.

63. The method of claim 62, wherein the plant is a dicot plant species.

64. The method of claim 62, wherein the plant is a monocot plant
species.

65. The method of claim 1, wherein the cell is a dicot plant cell.

66. The method of claim 1, wherein the cell is a monocot plant cell.

67. An isolated plant artificial chromosome comprising one or more
repeat regions, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

68. The method of claim 31, wherein the artificial chromosome is
produced by a method comprising:
introducing nucleic acid into a plant cell; and
selecting a plant cell comprising an artificial chromosome that
comprises one or more repeat regions, wherein:
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the common nucleic acid sequences comprise sequences that represent
euchromatic and heterochromatic nucleic acid.

69. The method of claim 44, wherein the recombinase is selected from
the group consisting of a bacteriophage P1 Cre recombinase, a yeast R
recombinase and a yeast FLP recombinase.

70. The method of claim 50, further comprising selecting first and
second transgenic plants wherein:
one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region
adjacent to the pericentric heterochromatin; and
the other plant comprises a chromosome comprising a
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the
two chromosomes are in the same orientation.

72. A method for producing an acrocentric plant chromosome,
comprising:
introducing nucleic acid comprising two site-specific recombination
sites into a cell comprising one or more plant chromosomes;
introducing a recombinase activity into the cell, wherein the
activity catalyzes recombination between the two recombination sites, whereby
a plant acrocentric chromosome is produced.

73. The method of claim 72, wherein the two site-specific
recombination sites are contained on separate nucleic acid fragments.

74. The method of claim 73, wherein the separate nucleic acid
fragments are introduced into the cell simultaneously or sequentially.

75. The method of claim 56, wherein the artificial chromosome is
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising:
introducing nucleic acid into a plant chromosome in a cell, wherein
the chromosome contains adjacent regions of rDNA and heterochromatic DNA;
culturing the cell through at least one cell division; and
selecting a cell comprising an artificial chromosome.

77. The method of claim 76, wherein the artificial chromosome is
predominantly heterochromatic.

78. The method of claim 76 or claim 77, wherein the plant
chromosome into which the nucleic acid is introduced is an acrocentric
chromosome.



WO 2002/096923 PCT/US2002/017451

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA.

80. The method of any of claims 76-79, wherein the heterochromatic
DNA is pericentric heterochromatin.

81. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
animal cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome.

82. The vector of claim 81, wherein the amplifiable region comprises
heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region comprises
rDNA.

84. The vector of claim 81, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the vector
to an amplifiable region of a plant chromosome comprises a sufficient portion
of an intergenic spacer region of rDNA to facilitate amplification or effect the
targeting.

85. The vector of claim 84, wherein the sufficient portion contains at
least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from an
intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes a
product that confers resistance to zeomycin.

88. The vector of claim 81, wherein the recognition site comprises an
att site.

89. The vector of claim 81 that is pAgIIa or pAgIIb.

90. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
animal cells in the presence of an agent normally toxic to the animal cells; and
wherein the agent is not toxic to plant cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

91. The vector of claim 90, wherein the recognition site comprises an
att site.

92. The vector of claim 90, further comprising a sequence of
nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline synthase
(NOS) or CaMV35S.

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region comprises
heterochromatic nucleic acid.

96. The vector of claim 92, wherein the amplifiable region comprises
rDNA.

97. The vector of claim 96, wherein the sequence of nucleotides that
facilitates amplification of a region of a plant chromosome or targets the vector
to an amplifiable region of a plant chromosome comprises a sufficient portion
of an intergenic spacer region of rDNA to effect the amplification or the
targeting.

98. The vector of claim 90, wherein the protein is a selectable marker
that permits growth of plant cells in the presence of an agent normally toxic to
the plant cells.

99. The vector of claim 98, wherein the selectable marker confers
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent
protein.

101. The vector of claim 90, wherein the fluorescent protein is selected
from the group consisting of green, blue and red fluorescent proteins.

102. A vector, comprising:
nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth of
plant cells in the presence of an agent normally toxic to the plant cells; and
wherein the agent is not toxic to animal cells;
a recognition site for recombination; and
nucleic acid encoding a protein operably linked to a plant promoter.

103. A vector, comprising:
a recognition site for recombination; and
a sequence of nucleotides that facilitates amplification of a region
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome, wherein the plant is selected from the group consisting of
Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea mays,
Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises
an att site.

105. A cell, comprising a vector of any of claims 81-104.

106. The cell of claim 105 that is a plant cell.
107. A method, comprising:
introducing a vector of claim 90 into a cell, wherein:
the cell comprises an animal platform ACes that contains a recognition site that
recombines with the recognition site in the vector in the presence of the
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein
operably linked to a plant promoter into the platform ACes to produce a
resulting platform ACes.

108. The method of claim 107, wherein the recombination sites are att
sites.

109. The method of claim 107, wherein the animal is a mammal.

110. The method of claim 107, wherein the platform ACes comprises
a promoter that upon recombination is operably linked to the selectable marker
that in the vector is not operably associated with a promoter.

111. The method of any of claims 107-110, further comprising,
transferring the resulting platform ACes into a plant cell to produce a plant cell
that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes is
isolated prior to transfer.

113. The method of claim 111, wherein the isolated ACes is introduced
into a plant cell by a method selected from the group consisting of protoplast
transfection, lipid-mediated delivery, liposomes, electroporation, sonoporation,
microinjection, particle bombardment, silicon carbide whisker-mediated
transformation, polyethylene glycol (PEG)-mediated DNA uptake, lipofection and
lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes is
transferred by fusion of the cells.

115. The method of claim 111, wherein the cells are plant protoplasts.

116. The method of any of claim 107, wherein the cell is an animal
cell.

117. The method of claim 116, wherein the animal cell is a mammalian
cell.

118. The method of claim 111, further comprising culturing the plant
cell that comprises the platform ACes under conditions whereby the protein
encoded by the nucleic acid that is operably linked to a plant promoter is
expressed.

119. A method, comprising:
introducing a vector of claim 81 into a plant cell;
culturing the plant cells; and
selecting a plant cell comprising an artificial chromosome that comprises
one or more repeat regions.

120. The method of claim 119, wherein sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of
chromosomal DNA.

121. The method of claim 119 or claim 120, wherein:
one or more nucleic acid units is (are) repeated in a repeat region;
repeats of a nucleic acid unit have common nucleic acid
sequences; and
the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the artificial
chromosome.

123. A method, comprising:
introducing a vector into a cell, wherein:
i) the vector comprises:
a) nucleic acid encoding a selectable marker that is not operably
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal
cells; and wherein the agent is not toxic to plant cells;
b) a recognition site for recombination; and
c) nucleic acid encoding a protein operably linked to an animal promoter;
ii) the cell comprises:
a platform plant artificial chromosome (PAC) that comprises a recombination
site and an animal promoter that upon recombination is operably linked to
the selectable marker that, in the vector, is not operably associated with a
promoter;
iii) introduction is effected under conditions whereby the
vector recombines with the PAC to produce a plant platform PAC that contains
the selectable marker operably linked to the promoter; and
culturing the resulting cell under conditions, whereby the protein encoded
by nucleic acid operably linked to an animal promoter is expressed.

124. The method of claim 119, wherein the artificial chromosome is an
ACes.

125. The method of claim 123, wherein the plant platform PAC is an
ACes.

126. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises nucleic acid encoding a selectable marker.

127. The vector of claim 81, further comprising one or more selectable
markers that when expressed in the plant cell permit the selection of the cell.
128. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a region 
of a plant chromosome or targets the vector to an amplifiable region of a plant
chromosome; and

one or more selectable markers that when expressed in a plant cell 
permit the selection of the cell; wherein 

the plant transformation vector is for Agrobacterium-mediated
transformation of plants. 

129. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and 

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

130. A method of producing a plant artificial chromosome, comprising:

introducing the vector of any of claims 81, 127 and 128 into a cell
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that comprises 
one or more repeat regions; wherein 
one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

131. The method of claim 123, wherein the cell into which the vector

is introduced is an animal cell. 

132. The method of claim 131, wherein the cell is a mammalian cell. 

AMENDED CLAIMS

[received by the International Bureau on 24 December 2002 (24.12.02); 
original claims 3, 9, 16, 20, 35, 52, 56, 80, 101, 105, 107, 111, 116, 123 and 128-132 amended; 

remaining claims unchanged (17 pages)] 

What is Claimed: 

1. A method for producing an artificial chromosome, comprising:
introducing nucleic acid into a cell comprising one or more plant 

chromosomes; and 
selecting a cell comprising an artificial chromosome that

comprises one or more repeat regions 
wherein: 

one or more nucleic acid units is (are) repeated in a repeat 

region; 

repeats of a nucleic acid unit have common nucleic acid

sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

2. The method of claim 1, wherein the artificial chromosome is 
predominantly made up of one or more repeat regions.

3. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises a nucleic acid sequence that facilitates amplification of a 
region of a plant chromosome or that targets the nucleic acid to an 
amplifiable region of a plant chromosome. 

4. The method of claim 1, wherein the nucleic acid introduced into

the cell comprises one or more nucleic acids selected from the group 
consisting of rDNA, lambda phage DNA and satellite DNA. 

5. The method of claim 4, wherein the nucleic acid comprises 
plant rDNA. 

6. The method of claim 5, wherein the rDNA is from a plant
selected from the group consisting of Arabidopsis, Nicotiana, Solanum,
Lycopersicon, Daucus, Hordeum, Zea mays, Brassica, Triticum and Oryza. 

7. The method of claim 4, wherein the nucleic acid comprises 
animal rDNA. 

8. The method of claim 7, wherein the rDNA is mammalian rDNA.



AMENDED SHEET (ARTICLE 19)

WO 2002/096923 PCT/US2002/017451



9. The method of claim 4, wherein the nucleic acid comprises 
rDNA comprising a sequence of an intergenic spacer region. 

10. The method of claim 9, wherein the intergenic spacer region is 
from DNA from a plant selected from the group consisting of Arabidopsis, 

Solanum, Lycopersicon, Hordeum, Zea, Oryza, rye, wheat, radish and mung
bean. 

11. The method of claim 1, wherein the nucleic acid introduced into
the cell comprises a nucleic acid sequence that facilitates identification of 
cells containing the nucleic acid. 

12. The method of claim 11, wherein the nucleic acid sequence

encodes a fluorescent protein. 

13. The method of claim 12, wherein the protein is a green 
fluorescent protein. 

14. The method of claim 1, wherein the step of selecting a cell 
comprising an artificial chromosome comprises sorting of cells into which

nucleic acid was introduced. 

15. The method of claim 1, wherein the step of selecting a cell
comprising an artificial chromosome comprises fluorescent in situ 
hybridization (FISH) analysis of cells into which nucleic acid was introduced. 

16. The method of claim 1, wherein the one or more plant

chromosomes contained in the cell is (are) selected from the group consisting 
of Arabidopsis, tobacco and Helianthus chromosomes. 

17. The method of claim 16, wherein the cell is a plant protoplast. 

18. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker.

19. The method of claim 18, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

20. An isolated plant artificial chromosome comprising one or more 
repeat regions, wherein:

one or more nucleic acid units is (are) repeated in a repeat
region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

21. The plant artificial chromosome of claim 20, wherein the artificial 
chromosome is predominantly made up of one or more repeat regions. 

22. A plant cell comprising an artificial chromosome, wherein the 
artificial chromosome is produced by the method of claim 1 or claim 2.

23. A method of producing a transgenic plant, comprising 
introducing the artificial chromosome of claim 20 or claim 21 into a plant cell. 

24. The method of claim 23, wherein the artificial chromosome 
comprises heterologous nucleic acid encoding a gene product. 

25. The method of claim 24, wherein the heterologous nucleic acid

encodes a product selected from the group consisting of enzymes, antisense 
RNA, tRNA, rDNA, structural proteins, marker proteins, ligands, receptors, 
ribozymes, therapeutic proteins and biopharmaceutical proteins. 



26. The method of claim 24, wherein the heterologous nucleic acid
encodes a product selected from the group consisting of vaccines, blood
factors, antigens, hormones, cytokines, growth factors and antibodies.

27. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for resistance to diseases, insects, herbicides
or stress in the plant.

28. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that provides for an agronomically important trait in the
plant.

29. The method of claim 24, wherein the heterologous nucleic acid
encodes a product that alters the nutrient utilization and/or improves the
nutrient quality of the plant.

30. The method of claim 24, wherein the heterologous nucleic acid
is contained within a bacterial artificial chromosome (BAC) or a yeast 
artificial chromosome (YAC). 

31. A method of identifying plant genes encoding particular traits, 
comprising:

generating an artificial chromosome comprising euchromatic 
DNA from a first species of plant; 

introducing the artificial chromosome into a plant cell of a 
second species of plant; and 

detecting phenotypic changes in the plant cell comprising the

artificial chromosome and/or a plant generated from the plant cell comprising 
the artificial chromosome. 

32. The method of claim 31, wherein the artificial chromosome is a 
plant artificial chromosome or a mammalian artificial chromosome. 

33. The method of claim 31, wherein the artificial chromosome is

produced by a method comprising: 

introducing nucleic acid into a cell comprising one or more plant 
chromosomes; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions, wherein:

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid. 

34. The method of claim 31, wherein the artificial chromosome is

produced by a method comprising: 

introducing nucleic acid into a plant cell; and 
selecting a plant cell comprising a SATAC. 
35. The method of claim 31, wherein the artificial chromosome is a 
minichromosome produced by a method comprising:

introducing nucleic acid into a plant cell; and
selecting a cell comprising a minichromosome comprising a 
neo-centromere and euchromatin. 

36. The method of any of claims 33-35, wherein the nucleic acid 
introduced into the plant cell comprises DNA encoding a selectable marker.

37. The method of claim 36, wherein the selectable marker confers 
resistance to phosphinothricin, ammonium glufosinate, glyphosate, 
kanamycin, hygromycin, dihydrofolate or sulfonylurea. 

38. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a

method comprising: 

introducing into a plant cell of a first plant species an artificial 
chromosome capable of undergoing homologous recombination with the DNA 
of the first plant species; 

selecting for a recombination event between the artificial chromosome

and the DNA of the first plant species; and 

selecting an artificial chromosome comprising euchromatic DNA from 
the first plant species. 

39. The method of claim 31, wherein the artificial chromosome 
comprising euchromatic DNA from a first plant species is produced by a

method comprising: 

introducing into a plant cell of a first species an artificial chromosome 
capable of undergoing site-specific recombination with the DNA of the first 
plant species; 

selecting for a site-specific recombination event between the artificial

chromosome and the DNA of the first plant species, and 

selecting an artificial chromosome comprising euchromatic DNA from 
the first plant species. 

40. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence.

41. The method of claim 39, wherein the artificial chromosome
comprises a site-specific recombination sequence. 

42. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence 

and the artificial chromosome comprises a site-specific recombination
sequence. 

43. The method of claim 39, wherein the DNA of the plant cell of a 
first species is modified to comprise a site-specific recombination sequence 
and the artificial chromosome comprises a site-specific recombination 

sequence that is complementary to the site-specific recombination sequence
of the plant cell of a first plant species. 

44. The method of claim 39, wherein the site-specific 
recombination is catalyzed by a recombinase enzyme. 

45. A method for producing an acrocentric plant chromosome, 
comprising:

introducing a first nucleic acid comprising a site-specific 
recombination site into a first chromosome of a plant cell; 

introducing a second nucleic acid comprising a site-specific 
recombination site into a second chromosome of the plant cell; 

introducing a recombinase activity into the plant cell, wherein

the activity catalyzes recombination between the first and second 
chromosomes and whereby an acrocentric plant chromosome is produced. 

46. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome. 

47. The method of claim 45, wherein the second nucleic acid is

introduced into the distal end of the arm of the second chromosome. 

48. The method of claim 45, wherein the first nucleic acid is 
introduced into the pericentric heterochromatin of the first chromosome and 
the second nucleic acid is introduced into the distal end of the arm of the 

second chromosome.






49. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing a first nucleic acid comprising a site-specific 
recombination site into the pericentric heterochromatin of a chromosome in a 
plant cell;

introducing a second nucleic acid comprising a site-specific 
recombination site into the distal end of the chromosome, wherein the first and 
second recombination sites are located on the same arm of the chromosome; 

introducing a recombinase activity into the cell, wherein the 
activity catalyzes recombination between the first and second recombination
sites in the chromosome and whereby an acrocentric plant chromosome is 
produced. 

50. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising a recombination site adjacent

to nucleic acid encoding a selectable marker into a first plant cell; 

generating a first transgenic plant from the first plant cell; 
introducing nucleic acid comprising a promoter functional in a 
plant cell, a recombination site and a recombinase coding region in operative 
linkage into a second plant cell;

generating a second transgenic plant from the second plant cell; 
crossing the first and second plants; 

obtaining plants resistant to an agent that selects for cells 
containing the nucleic acid encoding the selectable marker; and 

selecting a resistant plant that contains cells comprising an

acrocentric plant chromosome. 

51. The method of any of claims 45-50, wherein the DNA of the
short arm of the acrocentric chromosome contains less than 5% euchromatic 
DNA. 

52. The method of claim 51, wherein the DNA of the short arm of the
acrocentric chromosome contains less than 1% euchromatic DNA.

53. The method of any of claims 45-50, wherein the short arm of the
acrocentric chromosome does not contain euchromatic DNA. 

54. The method of any of claims 45-49, wherein the nucleic acid 
introduced into a chromosome comprises nucleic acid encoding a selectable 

marker.

55. An acrocentric plant artificial chromosome, wherein the short arm 
of the acrocentric chromosome does not contain euchromatic DNA. 

56. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant acrocentric chromosome in a 

cell, wherein the short arm of the acrocentric chromosome does not contain
euchromatic DNA; 

culturing the cell through at least one cell division; and 
selecting a cell comprising an artificial chromosome that is 
predominantly heterochromatic. 

57. The method of claim 56, wherein the acrocentric chromosome is

produced by the method of any of claims 45-49. 

58. A method for producing an artificial chromosome, comprising: 
introducing nucleic acid into a plant cell; and 

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions
wherein: 

one or more nucleic acid units is (are) repeated in a repeat region; 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

59. The method of claim 4, wherein the nucleic acid comprises plant 
rDNA from a dicot plant species. 

60. The method of claim 4, wherein the nucleic acid comprises plant

rDNA from a monocot plant species. 

61. The method of claim 9, wherein the intergenic spacer region is
from DNA from a Nicotiana plant. 

62. The method of claim 9, wherein the rDNA is plant rDNA. 

63. The method of claim 62, wherein the plant is a dicot plant 
species.

64. The method of claim 62, wherein the plant is a monocot plant 
species. 

65. The method of claim 1, wherein the cell is a dicot plant cell. 

66. The method of claim 1, wherein the cell is a monocot plant cell. 

67. An isolated plant artificial chromosome comprising one or more

repeat regions, wherein: 

one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid 
sequences; and

the common nucleic acid sequences comprise sequences that 
represent euchromatic and heterochromatic nucleic acid. 

68. The method of claim 31, wherein the artificial chromosome is 
produced by a method comprising: 

introducing nucleic acid into a plant cell; and

selecting a plant cell comprising an artificial chromosome that 
comprises one or more repeat regions, wherein: 

repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the common nucleic acid sequences comprise sequences that represent

euchromatic and heterochromatic nucleic acid. 

69. The method of claim 44, wherein the recombinase is selected 
from the group consisting of a bacteriophage P1 Cre recombinase, a yeast R
recombinase and a yeast FLP recombinase. 

70. The method of claim 50, further comprising selecting first and

second transgenic plants wherein: 

one of the plants comprises a chromosome comprising a
recombination site located on a short arm of the chromosome in a region 
adjacent to the pericentric heterochromatin; and 

the other plant comprises a chromosome comprising a 
recombination site located in rDNA of the chromosome.

71. The method of claim 70, wherein the recombination sites on the 
two chromosomes are in the same orientation. 

72. A method for producing an acrocentric plant chromosome, 
comprising: 

introducing nucleic acid comprising two site-specific

recombination sites into a cell comprising one or more plant chromosomes; 

introducing a recombinase activity into the cell, wherein the 

activity catalyzes recombination between the two recombination sites, whereby 

a plant acrocentric chromosome is produced. 

73. The method of claim 72, wherein the two site-specific

recombination sites are contained on separate nucleic acid fragments. 

74. The method of claim 73, wherein the separate nucleic acid 
fragments are introduced into the cell simultaneously or sequentially. 

75. The method of claim 56, wherein the artificial chromosome is 
predominantly heterochromatic.

76. A method of producing a plant artificial chromosome, comprising: 
introducing nucleic acid into a plant chromosome in a cell, 

wherein the chromosome contains adjacent regions of rDNA and 
heterochromatic DNA; 

culturing the cell through at least one cell division; and

selecting a cell comprising an artificial chromosome. 

77. The method of claim 76, wherein the artificial chromosome is 
predominantly heterochromatic. 

78. The method of claim 76 or claim 77, wherein the plant 
chromosome into which the nucleic acid is introduced is an acrocentric

chromosome. 

79. The method of claim 78, wherein the short arm of the
chromosome contains adjacent regions of rDNA and heterochromatic DNA. 

80. The method of claim 76, 77, or 79, wherein the 
heterochromatic DNA is pericentric heterochromatin. 

81. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth 
of animal cells in the presence of an agent normally toxic to the animal cells; 
and wherein the agent is not toxic to plant cells; 

a recognition site for recombination; and

a sequence of nucleotides that facilitates amplification of a 
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome. 

82. The vector of claim 81, wherein the amplifiable region
comprises heterochromatic nucleic acid.

83. The vector of claim 81, wherein the amplifiable region
comprises rDNA. 

84. The vector of claim 81, wherein the sequence of nucleotides 
that facilitates amplification of a region of a plant chromosome or targets the 

vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to facilitate amplification or 
effect the targeting. 

85. The vector of claim 84, wherein the sufficient portion contains 
at least 14, 20, 30, 50, 100, 150, 300 or 500 contiguous nucleotides from
an intergenic spacer region.

86. The vector of claim 81, wherein the selectable marker encodes 
a product that confers resistance to zeomycin. 

87. A plant transformation vector, comprising: 
a recognition site for recombination; 

a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region
of a plant chromosome; and 

one or more selectable markers that when expressed in a plant 
cell permit the selection of the cell; wherein 

the plant transformation vector is for Agrobacterium-mediated

transformation of plants. 

88. The vector of claim 81, wherein the recognition site comprises 
an att site. 

89. The vector of claim 81 that is pAgIIa or pAgIIb.

90. A vector, comprising:

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth
of animal cells in the presence of an agent normally toxic to the animal cells; 
and wherein the agent is not toxic to plant cells; 

a recognition site for recombination; and

nucleic acid encoding a protein operably linked to a plant promoter. 

91. The vector of claim 90, wherein the recognition site comprises 
an att site. 

92. The vector of claim 90, further comprising a sequence of 

nucleotides that facilitates amplification of a region of a plant chromosome or
targets the vector to an amplifiable region of a plant chromosome.

93. The vector of claim 90, wherein the promoter is nopaline 
synthase (NOS) or CaMV35S. 

94. The vector of claim 93 that is pAg1 or pAg2.

95. The vector of claim 92, wherein the amplifiable region

comprises heterochromatic nucleic acid. 

96. The vector of claim 92, wherein the amplifiable region 
comprises rDNA. 

97. The vector of claim 96, wherein the sequence of nucleotides 
that facilitates amplification of a region of a plant chromosome or targets the
vector to an amplifiable region of a plant chromosome comprises a sufficient
portion of an intergenic spacer region of rDNA to effect the amplification or 
the targeting. 

98. The vector of claim 90, wherein the protein is a selectable 
marker that permits growth of plant cells in the presence of an agent

normally toxic to the plant cells. 

99. The vector of claim 98, wherein the selectable marker confers 
resistance to hygromycin or to phosphinothricin.

100. The vector of claim 90, wherein the protein is a fluorescent 
protein.

101. The vector of claim 100, wherein the fluorescent protein is 
selected from the group consisting of green, blue and red fluorescent proteins. 

102. A vector, comprising: 

nucleic acid encoding a selectable marker that is not operably 
associated with any promoter, wherein the selectable marker permits growth
of plant cells in the presence of an agent normally toxic to the plant cells; and 
wherein the agent is not toxic to animal cells; 

a recognition site for recombination; and 

nucleic acid encoding a protein operably linked to a plant promoter. 

103. A vector, comprising:

a recognition site for recombination; and

a sequence of nucleotides that facilitates amplification of a
region of a plant chromosome or targets the vector to an amplifiable region of
a plant chromosome, wherein the plant is selected from the group consisting
of Arabidopsis, Nicotiana, Solanum, Lycopersicon, Daucus, Hordeum, Zea
mays, Brassica, Triticum, Helianthus, Glycine, soybean, Gossypium, cotton,
Helianthus, sunflower and Oryza.

104. The vector of claim 103, wherein the recognition site comprises
an att site.

105. A cell, comprising a vector of any of claims 81-86 and 88-104.

106. The cell of claim 105 that is a plant cell.

107. A method, comprising:

introducing a vector of claim 90 into a cell, wherein: 
the cell comprises an animal platform ACes that contains a recognition site 
that recombines with the recognition site in the vector in the presence of the 
recombinase therefor, thereby incorporating the selectable marker that is not
operably associated with any promoter and the nucleic acid encoding a protein 
operably linked to a plant promoter into the platform ACes to produce a 
resulting platform ACes. 

108. The method of claim 107, wherein the recombination sites are 
att sites.

109. The method of claim 107, wherein the animal is a mammal. 

110. The method of claim 107, wherein the platform ACes comprises 
a promoter that upon recombination is operably linked to the selectable 
marker that in the vector is not operably associated with a promoter. 

111. The method of any of claims 107-110, further comprising

transferring the resulting platform ACes into a plant cell to produce a plant 
cell that comprises the platform ACes.

112. The method of claim 111, wherein the resulting platform ACes 
is isolated prior to transfer. 

113. The method of claim 111, wherein the isolated ACes is

introduced into a plant cell by a method selected from the group consisting of 
protoplast transfection, lipid-mediated delivery, liposomes, electroporation, 
sonoporation, microinjection, particle bombardment, silicon carbide whisker- 
mediated transformation, polyethylene glycol (PEG)-mediated DNA uptake, 

lipofection and lipid-mediated carrier systems.

114. The method of claim 111, wherein the resulting platform ACes 
is transferred by fusion of the cells. 

115. The method of claim 111, wherein the cells are plant 
protoplasts. 

116. The method of claim 107, wherein the cell is an animal cell.

117. The method of claim 116, wherein the animal cell is a
mammalian cell. 

118. The method of claim 111, further comprising culturing the plant 
cell that comprises the platform ACes under conditions whereby the protein

encoded by the nucleic acid that is operably linked to a plant promoter is
expressed. 

119. A method, comprising: 

introducing a vector of claim 81 into a plant cell; 
culturing the plant cells; and 
selecting a plant cell comprising an artificial chromosome that comprises

one or more repeat regions. 

120. The method of claim 119, wherein a sufficient portion of the vector
integrates into a chromosome in the plant cell to result in amplification of 
chromosomal DNA. 

121. The method of claim 119 or claim 120, wherein:

one or more nucleic acid units is (are) repeated in a repeat region;

repeats of a nucleic acid unit have common nucleic acid
sequences; and

the repeat region(s) contain substantially equivalent amounts of
euchromatic and heterochromatic nucleic acid.

122. The method of claim 119, further comprising isolating the
artificial chromosome.



123. A method, comprising:

introducing a vector into a cell, wherein:

i) the vector comprises:

a) nucleic acid encoding a selectable marker that is
not operably associated with any promoter, wherein the
selectable marker permits growth of animal cells in the presence
of an agent normally toxic to the animal cells; and wherein the
agent is not toxic to plant cells;

b) a recognition site for recombination; and

c) nucleic acid encoding a protein operably linked to
an animal promoter;

ii) the cell comprises:

a platform plant artificial chromosome (PAC) that
comprises a recombination site and an animal promoter that upon
recombination is operably linked to the selectable marker that, in
the vector, is not operably associated with a
promoter;

iii) introduction is effected under conditions whereby
the vector recombines with the PAC to produce a plant platform PAC that
contains the selectable marker operably linked to the promoter; and

culturing the resulting cell under conditions, whereby the protein
encoded by nucleic acid operably linked to an animal promoter is expressed.

124. The method of claim 119, wherein the artificial chromosome is an

ACes. 

125. The method of claim 123, wherein the plant platform PAC is an 

ACes. 

126. The method of claim 1, wherein the nucleic acid introduced into 
the cell comprises nucleic acid encoding a selectable marker.

127. The vector of claim 81, further comprising one or more selectable 
markers that when expressed in the plant cell permit the selection of the cell. 

128. A method of producing a plant artificial chromosome, comprising: 
introducing the vector of claim 81, 87 or 127 into a cell 

comprising one or more plant chromosomes; and

selecting a cell comprising an artificial chromosome that 
comprises one or more repeat regions; wherein 

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid sequences; and 
the common nucleic acid sequences comprise sequences that
represent euchromatic and heterochromatic nucleic acid.

129. A method of producing a plant artificial chromosome, comprising:
introducing the vector of claim 81, 87 or 127 into a cell 
comprising one or more plant chromosomes; and 

selecting a cell comprising an artificial chromosome that 
comprises one or more repeat regions; wherein

one or more nucleic acid units is (are) repeated in a repeat region; 
repeats of a nucleic acid unit have common nucleic acid 
sequences; and 

the repeat region(s) contain substantially equivalent amounts of 
euchromatic and heterochromatic nucleic acid.

130. The method of claim 123, wherein the cell into which the vector
is introduced is an animal cell. 

131. The method of claim 130, wherein the cell is a mammalian cell.

132. The method of claim 78, wherein the heterochromatic DNA is 
pericentric heterochromatin.

SEQUENCE LISTING

<110> CHROMOS MOLECULAR SYSTEMS, INC. 
Perez, Carl 
Fabijanski, Steven 
Perkins, Edward 

<120> Plant Artificial Chromosomes, Uses thereof, and Methods of Preparing
Plant Artificial Chromosomes 

<130> 24601-419PC 

<140> Not Yet Assigned 
<141> Herewith 

<150> US 60/294,687 
<151> 2001-05-30 

<150> US 60/296,329 
<151> 2001-06-04 

<160> 51 

<170> FastSEQ for Windows Version 4.0 

<210> 1 
<211> 11182 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAgl plasmid 
<400> 1 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 

atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120 

agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 

gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240 

agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300

ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360

ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 

acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 4 80 
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 54 0 

acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600 

agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660 

tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720 

tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780 

ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 84 0 

gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 

gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 

cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 102 0 

ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080 

gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 114 0 

tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 12 0 0 

aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260 

aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 132 0 

ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 

ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 144 0 

cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 

atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 

accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 

gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680 

gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740 

ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800 

cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 

aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 192 0 

gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 198 0 
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttggcact 10860
ggccgtcgtt ttacaacgtc gtgactggga aaaccctggc gttacccaac ttaatcgcct 10920
tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 10980
ttcccaacag ttgcgcagcc tgaatggcga atgctagagc agcttgagct tggatcagat 11040
tgtcgtttcc cgccttcagt ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 11100
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 11160
tccgttcgtc catttgtatg tg 11182



<210> 2 
<211> 8428 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCambia3300 plasmid 



<400> 2 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60
atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120
agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180
gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240
agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300
ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360
ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420
acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480
ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540
acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600
agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660
tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720
tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780
ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840
gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900
gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960
cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020
ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080
gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140
tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200
aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260
aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320
ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380
ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440
cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500
atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560
accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620
gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680
gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740
ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800
cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860
aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920
gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980
agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040
ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100
atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gctcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agctcggtac ccggggatcc tctagagtcg acctgcaggc atgcaagctt 8100
ggcactggcc gtcgttttac aacgtcgtga ctgggaaaac cctggcgtta cccaacttaa 8160
tcgccttgca gcacatcccc ctttcgccag ctggcgtaat agcgaagagg cccgcaccga 8220
tcgcccttcc caacagttgc gcagcctgaa tggcgaatgc tagagcagct tgagcttgga 8280
tcagattgtc gtttcccgcc ttcagtttaa actatcagtg tttgacagga tatattggcg 8340
ggtaaaccta agagaaaaga gcgtttatta gaataacgga tatttaaaag ggcgtgaaaa 8400
ggtttatccg ttcgtccatt tgtatgtg 8428

<210> 3
<211> 10549
<212> DNA

<213> Artificial Sequence
<220>

<223> pCambia1302 plasmid
<300>

<308> Genbank #AF234298
<309> 2000-04-24

<400> 3 

catggtagat ctgactagta aaggagaaga acttttcact ggagttgtcc caattcttgt 60
tgaattagat ggtgatgtta atgggcacaa attttctgtc agtggagagg gtgaaggtga 120
tgcaacatac ggaaaactta cccttaaatt tatttgcact actggaaaac tacctgttcc 180
gtggccaaca cttgtcacta ctttctctta tggtgttcaa tgcttttcaa gatacccaga 240
tcatatgaag cggcacgact tcttcaagag cgccatgcct gagggatacg tgcaggagag 300
gaccatcttc ttcaaggacg acgggaacta caagacacgt gctgaagtca agtttgaggg 360
agacaccctc gtcaacagga tcgagcttaa gggaatcgat ttcaaggagg acggaaacat 420
cctcggccac aagttggaat acaactacaa ctcccacaac gtatacatca tggccgacaa 480
gcaaaagaac ggcatcaaag ccaacttcaa gacccgccac aacatcgaag acggcggcgt 540
gcaactcgct gatcattatc aacaaaatac tccaattggc gatggccctg tccttttacc 600
agacaaccat tacctgtcca cacaatctgc cctttcgaaa gatcccaacg aaaagagaga 660
ccacatggtc cttcttgagt ttgtaacagc tgctgggatt acacatggca tggatgaact 720
atacaaagct agccaccacc accaccacca cgtgtgaatt ggtgaccagc tcgaatttcc 780
ccgatcgttc aaacatttgg caataaagtt tcttaagatt gaatcctgtt gccggtcttg 840
cgatgattat catataattt ctgttgaatt acgttaagca tgtaataatt aacatgtaat 900
gcatgacgtt atttatgaga tgggttttta tgattagagt cccgcaatta tacatttaat 960
acgcgataga aaacaaaata tagcgcgcaa actaggataa attatcgcgc gcggtgtcat 1020
ctatgttact agatcgggaa ttaaactatc agtgtttgac aggatatatt ggcgggtaaa 1080
cctaagagaa aagagcgttt attagaataa cggatattta aaagggcgtg aaaaggttta 1140
tccgttcgtc catttgtatg tgcatgccaa ccacagggtt cccctcggga tcaaagtact 1200
ttgatccaac ccctccgctg ctatagtgca gtcggcttct gacgttcagt gcagccgtct 1260
tctgaaaacg acatgtcgca caagtcctaa gttacgcgac aggctgccgc cctgcccttt 1320
tcctggcgtt ttcttgtcgc gtgttttagt cgcataaagt agaatacttg cgactagaac 1380
cggagacatt acgccatgaa caagagcgcc gccgctggcc tgctgggcta tgcccgcgtc 1440
agcaccgacg accaggactt gaccaaccaa cgggccgaac tgcacgcggc cggctgcacc 1500
aagctgtttt ccgagaagat caccggcacc aggcgcgacc gcccggagct ggccaggatg 1560
cttgaccacc tacgccctgg cgacgttgtg acagtgacca ggctagaccg cctggcccgc 1620
agcacccgcg acctactgga cattgccgag cgcatccagg aggccggcgc gggcctgcgt 1680
agcctggcag agccgtgggc cgacaccacc acgccggccg gccgcatggt gttgaccgtg 1740
ttcgccggca ttgccgagtt cgagcgttcc ctaatcatcg accgcacccg gagcgggcgc 1800
gaggccgcca aggcccgagg cgtgaagttt ggcccccgcc ctaccctcac cccggcacag 1860
atcgcgcacg cccgcgagct gatcgaccag gaaggccgca ccgtgaaaga ggcggctgca 1920
ctgcttggcg tgcatcgctc gaccctgtac cgcgcacttg agcgcagcga ggaagtgacg 1980
cccaccgagg ccaggcggcg cggtgccttc cgtgaggacg cattgaccga ggccgacgcc 2040
ctggcggccg ccgagaatga acgccaagag gaacaagcat gaaaccgcac caggacggcc 2100
aggacgaacc gtttttcatt accgaagaga tcgaggcgga gatgatcgcg gccgggtacg 2160
tgttcgagcc gcccgcgcac gtctcaaccg tgcggctgca tgaaatcctg gccggtttgt 2220
ctgatgccaa gctggcggcc tggccggcca gcttggccgc tgaagaaacc gagcgccgcc 2280
gtctaaaaag gtgatgtgta tttgagtaaa acagcttgcg tcatgcggtc gctgcgtata 2340
tgatgcgatg agtaaataaa caaatacgca aggggaacgc atgaaggtta tcgctgtact 2400
taaccagaaa ggcgggtcag gcaagacgac catcgcaacc catctagccc gcgccctgca 2460
actcgccggg gccgatgttc tgttagtcga ttccgatccc cagggcagtg cccgcgattg 2520
ggcggccgtg cgggaagatc aaccgctaac cgttgtcggc atcgaccgcc cgacgattga 2580
ccgcgacgtg aaggccatcg gccggcgcga cttcgtagtg atcgacggag cgccccaggc 2640
ggcggacttg gctgtgtccg cgatcaaggc agccgacttc gtgctgattc cggtgcagcc 2700
aagcccttac gacatatggg ccaccgccga cctggtggag ctggttaagc agcgcattga 2760
ggtcacggat ggaaggctac aagcggcctt tgtcgtgtcg cgggcgatca aaggcacgcg 2820
catcggcggt gaggttgccg aggcgctggc cgggtacgag ctgcccattc ttgagtcccg 2880
tatcacgcag cgcgtgagct acccaggcac tgccgccgcc ggcacaaccg ttcttgaatc 2940
agaacccgag ggcgacgctg cccgcgaggt ccaggcgctg gccgctgaaa ttaaatcaaa 3000
actcatttga gttaatgagg taaagagaaa atgagcaaaa gcacaaacac gctaagtgcc 3060
ggccgtccga gcgcacgcag cagcaaggct gcaacgttgg ccagcctggc agacacgcca 3120
gccatgaagc gggtcaactt tcagttgccg gcggaggatc acaccaagct gaagatgtac 3180
gcggtacgcc aaggcaagac cattaccgag ctgctatctg aatacatcgc gcagctacca 3240
gagtaaatga gcaaatgaat aaatgagtag atgaatttta gcggctaaag gaggcggcat 3300
ggaaaatcaa gaacaaccag gcaccgacgc cgtggaatgc cccatgtgtg gaggaacggg 3360
cggttggcca ggcgtaagcg gctgggttgt ctgccggccc tgcaatggca ctggaacccc 3420
caagcccgag gaatcggcgt gacggtcgca aaccatccgg cccggtacaa atcggcgcgg 3480
cgctgggtga tgacctggtg gagaagttga aggccgcgca ggccgcccag cggcaacgca 3540
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840
tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980
gcgcgtttcg
gcttgtctgt 
ggcgggtgtc 
ttaactatgc 
cgcacagatg 
actcgctgcg
tacggttatc 
aaaaggccag 
ctgacgagca 
aaagatacca 
cgcttaccgg 
cacgctgtag 
aaccccccgt 
cggtaagaca 
ggtatgtagg 
ggacagtatt 
gctcttgatc 
agattacgcg 
acgctcagtg 
acaattcatc 
agtcaaaaaa 
agaaggcaat 
ttactttgcc 
gttcctcttc 
gagtgtcttc 
ccaattcggc 
agtgaaagag 
cttcatactc 
catcatgccg 
tcatgtcctt 
ttaaatatag 
ccgtatcttt 
ttttagccat 
taattataac 
gaaaacagct 
gattttgaaa 
taccctccgc 
agcatcggta 
cggactgatg 
tgttggctgg 
aataacacat 
tggattttag 
acaaatacaa 
ggaaccctaa 
gtcgatcgac 
gcgtcggttt 
tctgcgggcg 
tcgaccctgc 
gtcaagacca 
cctccgctcg 
gatgttggcg 
tgttatgcgg 
ccggacttcg 
cgcactgacg 
gcatatgaaa 
cccgctcgtc 
tagaacagcg 
ggagatgcaa 
gagcgcggcc 
gctatttacc 
ttcgccctcc 
ctcgacagac 
gaaagctcga 
aatgaaatga 
atcccttacg 
gtcttctttt 
agaggcatct 



gtgatgacgg 
aagcggatgc 
ggggcgcagc 

ggcatcagag 
cgtaaggaga 
ctcggtcgtt 
cacagaatca 
gaaccgtaaa 
tcacaaaaat 
ggcgtttccc 
atacctgtcc 
gtatctcagt 
tcagcccgac 
cgacttatcg 
cggtgctaca 
tggtatctgc 
cggcaaacaa 
cagaaaaaaa 
gaacgaaaac 
cagtaaaata 
tagctcgaca 
gtcataccac 
atctttcaca 
gggcttttcc 
ttcccagttt 
taagcggctg 
cctgatgcac 
ttccgagcaa 
ttcaaagtgc 
ttcccgttcc 
gttttcattt 
tacgcagcgg 
ttattatttc 
aagacgaact 
ttttcaaagt 
ccgcggtgat 
gagatcatcc 
acatgagcaa 
ggctgcctgt 
ctggtggcag 
tgcggacgtt 
tactggattt 
atacatacta 
ttcccttatc 
agatccggtc 
ccactatcgg 
atttgtgtac 
gcccaagctg 
atgcggagca 
aagtagcgcg 
acctcgtatt 
ccattgtccg 
gggcagtcct 
gtgtcgtcca 
tcacgccatg 
tggctaagat 
ggcagttcgg 
taggtcaggc 
gatgcaaagt 
cgcaggacat 
gagagctgca 
gtcgcggtga 
gagagataga 
acttccttat 
tcagtggaga 
tccacgatgc 
tgaacgatag 



tgaaaacctc 
cgggagcaga 
catgacccag 
cagattgtac 
aaataccgca 
cggctgcggc 
ggggataacg 
aaggccgcgt 
cgacgctcaa 
cctggaagct 
gcctttctcc 
tcggtgtagg 
cgctgcgcct 
ccactggcag 
gagttcttga 
gctctgctga 
accaccgctg 
ggatctcaag 
tcacgttaag 
taatatttta 
tactgttctt 
ttgtccgccc 
aagatgfctgc 
gtctttaaaa 
tcgcaatcca 
tctaagctat 
tccgcataca 
aggacgccat 
aggacctttg 
acatcatagg 
tctcccacca 
tatttttcga 
cttcctcttt 
ccaattcact 
tgttttcaaa 
cacaggcagc 
gtgtttcaaa 
agtctgccgc 
atcgagtggt 
gatatattgt 
tttaatgtac 
tggttttagg 
agggtttctt 
tgggaactac 
ggcatctact 
cgagtacttc 
gcccgacagt 
catcatcgaa 
tatacgcccg 
tctgctgctc 
gggaatcccc 
tcaggacatt 
cggcccaaag 
tcacagtttg 
tagtgtattg 
cggccgcagc 
tttcaggcag 
tctcgctaaa 
gccgataaac 
atccacgccc 
tcaggtcgga 
gttcaggctt 
tttgtagaga 
atagaggaag 
tatcacatca 
tcctcgtggg 
cctttccttt 



tgacacatgc 
caagcccgtc 
tcacgtagcg 
tgagagtgca 
tcaggcgctc 
gagcggtatc 
caggaaagaa 
tgctggcgtt 
gtcagaggtg 
ccctcgtgcg 
cttcgggaag 
tcgttcgctc 
tatccggtaa 
cagccactgg 
agtggtggcc 
agccagttac 
gtagcggtgg 
aagatccttt 
ggattttggt 
ttttctccca 
ccccgatatc 
tgccgcttct 
tgtctcccag 
aatcatacag 
catcggccag 
tcgtataggg 
gctcgataat 
cggcctcact 
gaacaggcag 
tggtcccttt 
gcttatatac 
tcagtttttt 
tctacagtat 
gttccttgca 
gttggcgtat 
aacgctctgt 
cccggcagct 
cttacaacgg 
gattttgtgc 
ggtgtaaaca 
tgaattaacg 
aattagaaat 
atatgctcaa 
tcacacatta 
ctatttcttt 
tacacagcca 
cccggctccg 
attgccgtca 
gagtcgtggc 
catacaagcc 
gaacatcgcc 
gttggagccg 
catcagctca 
ccagtgatac 
accgattcct 
gatcgcatcc 
gtcttgcaac 
ctccccaatg 
ataacgatct 
tcctacatcg 
gacgctgtcg 
tttcatatct 
gagactggtg 
gtcttgcgaa 
atccacttgc 
tgggggtcca 
atcgcaatga 



agctcccgga 
agggcgcgtc 
atagcggagt 
ccatatgcgg 
ttccgcttcc 
agctcactca 
catgtgagca 
tttccatagg 
gcgaaacccg 
ctctcctgtt 
cgtggcgctt 
caagctgggc 
ctatcgtctt 
taacaggatt 
taactacggc 
cttcggaaaa 
tttttttgtt 
gatcttttct 
catgcattct 
atcaggcttg 
ctccctgatc 
cccaagatca 
gtcgccgtgg 
ctcgcgcgga 
atcgttattc 
acaatccgat 
cttttcaggg 
catgagcaga 
ctttccttcc 
ataccggctg 
cttagcagga 
caattccggt 
ttaaagatac 
ttctaaaacc 
aacatagtat 
catcgttaca 
tagttgccgt 
ctctcccgct 
cgagctgccg 
aattgacgct 
ccgaattaat 
tttattgata 
cacatgagcg 
ttatggagaa 
gccctcggac 
tcggtccaga 
gatcggacga 
accaagctct 
gatcctgcaa 
aaccacggcc 
tcgctccagt 
aaatccgcgt 
tcgagagcct 
acatggggat 
tgcggtccga 
atagcctccg 
gtgacaccct 
tcaagcactt 
ttgtagaaac 
aagctgaaag 
aacttttcga 
cattgccccc 
atttcagcgt 
ggatagtggg 
tttgaagacg 
tctttgggac 
tggcatttgt 



gacggtcaca 
agcgggtgtt 
gtatactggc 
tgtgaaatac 
tcgctcactg 
aaggcggtaa 
aaaggccagc 
ctccgccccc 
acaggactat 
ccgaccctgc 
tctcatagct 
tgtgtgcacg 
gagtccaacc 
agcagagcga 
tacactagaa 
agagttggta 
tgcaagcagc 
acggggtctg 
aggtacta'aa 
atccccagta 
gaccggacgc 
ataaagccac 
gaaaagacaa 
tctttaaatg 
agtaagtaat 
atgtcgatgg 
ctttgttcat 
ttgctccagc 
agccatagca 
tccgtcattt 
gacattcctt 
gatattctca 
cccaagaagc 
ttaaatacca 
cgacggagcc 
atcaacatgc 
tcttccgaat 
gacgccgtcc 
gtcggggagc 
tagacaactt 
tcgggggatc 
gaagtatttt 
aaaccctata 
actcgagctt 
gagtgctggg 
cggccgcgct 
ttgcgtcgca 
gatagagttg 
gctccggatg 
tccagaagaa 
caatgaccgc 
gcacgaggtg 
gcgcgacgga 
cagcaatcgc 
atgggccgaa 
cgaccggttg 
gtgcacggcg 
ccggaatcgg 
catcggcgca 
cacgagattc 
tcagaaactt 
cgggatctgc 
gtcctctcca 
attgtgcgtc 
tggttggaac 
cactgtcggc 
aggtgccacc 



5040 
5100 
5160 
5220 
5280 
5340 
5400 
5460 
5520 
5580 
5640 
5700 
5760 
5820 
5880 
5940 
6000 
6060 
6120 
6180 
6240 
6300 
6360 
6420 
6480 
6540 
6600 
6660 
6720 
6780 
6840 
6900 
6960 
7020 
7080 
7140 
7200 
7260 
7320 
7380 
7440 
7500 
7560 
7620 
7680 
7740 
7800 
7860 
7920 
7980 
8040 
8100 
8160 
8220 
8280 
8340 
8400 
8460 
8520 
8580 
8640 
8700 
8760 
8820 
8880 
8940 
9000 
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ttccttttct 
gtttcccgat 
atctttgata 
cacttgcttt 
gggtccatct 
gcaatgatgg 
gatagctggg 
aatagccctt 
gtgctccacc 
tggccgattc 
cgcaacgcaa 
cttccggctc 
tatgaccatg 
gcatgcaagc 
tacccaactt 
ggcccgcacc 
cttgagcttg 
tcaaafcagag 
cttacgactc 
ctactccaaa 
acaaagggta 
tgtgaagata 
ggccatcgtt 
gagcatcgtg 
tatctccact 
tatataagga 



actgtcctfcfc 
attacccttt 
ttcttggagt 
gaagacgtgg 
ttgggaccac 
catttgtagg 
caatggaatc 
tggtcttctg 
atgttggcaa 
attaatgcag 
ttaatgtgag 
gtatgttgtg 
attacgaatt 
ttggcactgg 
aatcgccttg 
gatcgccctt 
gatcagattg 
gacctaacag 
aatgacaaga 
aatatcaaag 
atatccggaa 
gtggaaaagg 
gaagatgcct 
gaaaaagaag 
gacgtaaggg 
agttcatttc 



tgatgaagtg 
gttgaaaagt 
agacgagagt 
ttggaacgtc 
tgtcggcaga 
tgccaccttc 
cgaggaggtt 
agactgtatc 
gctgctctag 
ctggcacgac 
ttagctcact 
tggaattgtg 
cgagctcggt 
ccgtcgtttt 
cagcacatcc 
cccaacagtt 
tcgtttcccg 
aactcgccgt 
agaaaatctt 
atacagtctc 
acctcctcgg 
aaggtggctc 
ctgccgacag 
acgttccaac 
atgacgcaca 
atttggagag 



acagatagct 
ctcaatagcc 
gtcgtgctcc 
ttctttttcc 
ggcatcttga 
cttttctact 
tcccgatatt 
tttgatattc 
ccaatacgca 
aggtttcccg 
cattaggcac 
agcggataac 
acccggggat 
acaacgtcgt 
ccctttcgcc 
gcgcagcctg 
ccttcagttt 
aaagactggc 
cgtcaacatg 
agaagaccaa 
attccattgc 
ctacaaatgc 
tggtcccaaa 
cacgtcttca 
atcccactat 
aacacggggg 



gggcaatgga 
ctttggtctt 
accatgttat 
acgatgctcc 
acgatagcct 
gtccttttga 
accctttgtt 
ttggagtaga 
aaccgcctct 
actggaaagc 
cccaggcttt 
aatttcacac 
cctctagagt 
gactgggaaa 
agctggcgta 
aatggcgaat 
agcttcatgg 
gaacagt tea 
gtggagcacg 
agggcaattg 
ccagctatct 
catcattgeg 
gatggacccc 
aagcaagtgg 
ccttcgcaag 
actcttgac 



atccgaggag 
ctgagactgt 
cacatcaatc 
tcgtgggtgg 
ttcctttatc 
tgaagtgaca 
gaaaagtctc 
cgagagtgtc 
ccccgcgcgt 
gggcagtgag 
acactttatg 
aggaaacagc 
cgacctgcag 
accctggcgt 
atagegaaga 
gctagagcag 
agtcaaagat 
tacagagtct 
acacacttgt 
agacttttca 
gtcactttat 
ataaaggaaa 
cacccacgag 
attgatgtga 
acccttcctc 



9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 

9540 

9600 

9660 

9720 

9780 

9840 

9900 

9960 

10020 

10080 

10140 

10200 

10260 

10320 

10380 

10440 

10500 

10549 



<210> 4
<211> 33
<212> DNA
<213> Artificial Sequence
<220>
<223> CaMV35SpolyA Primer
<400> 4
ctgaattaac gccgaattaa ttcgggggat ctg 33

<210> 5
<211> 29
<212> DNA
<213> Artificial Sequence
<220>
<223> CaMV35Spr Primer
<400> 5
ctagagcagc ttgccaacat ggtggagca 29

<210> 6
<211> 12592
<212> DNA
<213> Artificial Sequence
<220>
<223> pAg2 Plasmid
<400> 6

gtacgaagaa ggccaagaac ggccgcctgg tgacggtatc cgagggtgaa gccttgatta 60
gccgctacaa gatcgtaaag agcgaaaccg ggcggccgga gtacatcgag atcgagctag 120
ctgattggat gtaccgcgag atcacagaag gcaagaaccc ggacgtgctg acggttcacc 180
ccgattactt tttgatcgat cccggcatcg gccgttttct ctaccgcctg gcacgccgcg 240
ccgcaggcaa ggcagaagcc agatggttgt tcaagacgat ctacgaacgc agtggcagcg 300
ccggagagtt caagaagttc tgtttcaccg tgcgcaagct gatcgggtca aatgacctgc 360
cggagtacga tttgaaggag gaggcggggc aggctggccc gatcctagtc atgcgctacc 420
gcaacctgat cgagggcgaa gcatccgccg gttcctaatg tacggagcag atgctagggc 480
aaattgccct agcaggggaa aaaggtcgaa aaggtctctt tcctgtggat agcacgtaca 540
ttgggaaccc aaagccgtac attgggaacc ggaacccgta cattgggaac ccaaagccgt 600
acattgggaa ccggtcacac atgtaagtga ctgatataaa agagaaaaaa ggcgattttt 660
ccgcctaaaa ctctttaaaa cttattaaaa ctcttaaaac ccgcctggcc tgtgcataac 720
tgtctggcca gcgcacagcc gaagagctgc aaaaagcgcc tacccttcgg tcgctgcgct 780
ccctacgccc cgccgcttcg cgtcggccta tcgcggccgc tggccgctca aaaatggctg 840
gcctacggcc aggcaatcta ccagggcgcg gacaagccgc gccgtcgcca ctcgaccgcc 900
ggcgcccaca tcaaggcacc ctgcctcgcg cgtttcggtg atgacggtga aaacctctga 960
cacatgcagc tcccggagac ggtcacagct tgtctgtaag cggatgccgg gagcagacaa 1020
gcccgtcagg gcgcgtcagc gggtgttggc gggtgtcggg gcgcagccat gacccagtca 1080
cgtagcgata gcggagtgta tactggctta actatgcggc atcagagcag attgtactga 1140
gagtgcacca tatgcggtgt gaaataccgc acagatgcgt aaggagaaaa taccgcatca 1200
ggcgctcttc cgcttcctcg ctcactgact cgctgcgctc ggtcgttcgg ctgcggcgag 1260
cggtatcagc tcactcaaag gcggtaatac ggttatccac agaatcaggg gataacgcag 1320
gaaagaacat gtgagcaaaa ggccagcaaa aggccaggaa ccgtaaaaag gccgcgttgc 1380
tggcgttttt ccataggctc cgcccccctg acgagcatca caaaaatcga cgctcaagtc 1440
agaggtggcg aaacccgaca ggactataaa gataccaggc gtttccccct ggaagctccc 1500
tcgtgcgctc tcctgttccg accctgccgc ttaccggata cctgtccgcc tttctccctt 1560
cgggaagcgt ggcgctttct catagctcac gctgtaggta tctcagttcg gtgtaggtcg 1620
ttcgctccaa gctgggctgt gtgcacgaac cccccgttca gcccgaccgc tgcgccttat 1680
ccggtaacta tcgtcttgag tccaacccgg taagacacga cttatcgcca ctggcagcag 1740
ccactggtaa caggattagc agagcgaggt atgtaggcgg tgctacagag ttcttgaagt 1800
ggtggcctaa ctacggctac actagaagga cagtatttgg tatctgcgct ctgctgaagc 1860
cagttacctt cggaaaaaga gttggtagct cttgatccgg caaacaaacc accgctggta 1920
gcggtggttt ttttgtttgc aagcagcaga ttacgcgcag aaaaaaagga tctcaagaag 1980
atcctttgat cttttctacg gggtctgacg ctcagtggaa cgaaaactca cgttaaggga 2040
ttttggtcat gcattctagg tactaaaaca attcatccag taaaatataa tattttattt 2100
tctcccaatc aggcttgatc cccagtaagt caaaaaatag ctcgacatac tgttcttccc 2160
cgatatcctc cctgatcgac cggacgcaga aggcaatgtc ataccacttg tccgccctgc 2220
cgcttctccc aagatcaata aagccactta ctttgccatc tttcacaaag atgttgctgt 2280
ctcccaggtc gccgtgggaa aagacaagtt cctcttcggg cttttccgtc tttaaaaaat 2340
catacagctc gcgcggatct ttaaatggag tgtcttcttc ccagttttcg caatccacat 2400
cggccagatc gttattcagt aagtaatcca attcggctaa gcggctgtct aagctattcg 2460
tatagggaca atccgatatg tcgatggagt gaaagagcct gatgcactcc gcatacagct 2520
cgataatctt ttcagggctt tgttcatctt catactcttc cgagcaaagg acgccatcgg 2580
cctcactcat gagcagattg ctccagccat catgccgttc aaagtgcagg acctttggaa 2640
caggcagctt tccttccagc catagcatca tgtccttttc ccgttccaca tcataggtgg 2700
tccctttata ccggctgtcc gtcattttta aatataggtt ttcattttct cccaccagct 2760
tatatacctt agcaggagac attccttccg tatcttttac gcagcggtat ttttcgatca 2820
gttttttcaa ttccggtgat attctcattt tagccattta ttatttcctt cctcttttct 2880
acagtattta aagatacccc aagaagctaa ttataacaag acgaactcca attcactgtt 2940
ccttgcattc taaaacctta aataccagaa aacagctttt tcaaagttgt tttcaaagtt 3000
ggcgtataac atagtatcga cggagccgat tttgaaaccg cggtgatcac aggcagcaac 3060
gctctgtcat cgttacaatc aacatgctac cctccgcgag atcatccgtg tttcaaaccc 3120
ggcagcttag ttgccgttct tccgaatagc atcggtaaca tgagcaaagt ctgccgcctt 3180
acaacggctc tcccgctgac gccgtcccgg actgatgggc tgcctgtatc gagtggtgat 3240
tttgtgccga gctgccggtc ggggagctgt tggctggctg gtggcaggat atattgtggt 3300
gtaaacaaat tgacgcttag acaacttaat aacacattgc ggacgttttt aatgtactga 3360
attaacgccg aattaattcg ggggatctgg attttagtac tggattttgg ttttaggaat 3420
tagaaatttt attgatagaa gtattttaca aatacaaata catactaagg gtttcttata 3480
tgctcaacac atgagcgaaa ccctatagga accctaattc ccttatctgg gaactactca 3540
cacattatta tggagaaact cgagtcaaat ctcggtgacg ggcaggaccg gacggggcgg 3600
taccggcagg ctgaagtcca gctgccagaa acccacgtca tgccagttcc cgtgcttgaa 3660
gccggccgcc cgcagcatgc cgcggggggc atatccgagc gcctcgtgca tgcgcacgct 3720
cgggtcgttg ggcagcccga tgacagcgac cacgctcttg aagccctgtg cctccaggga 3780
cttcagcagg tgggtgtaga gcgtggagcc cagtcccgtc cgctggtggc ggggggagac 3840
gtacacggtc gactcggccg tccagtcgta ggcgttgcgt gccttccagg ggcccgcgta 3900
ggcgatgccg gcgacctcgc cgtccacctc ggcgacgagc cagggatagc gctcccgcag 3960
acggacgagg tcgtccgtcc actcctgcgg ttcctgcggc tcggtacgga agttgaccgt 4020
gcttgtctcg atgtagtggt tgacgatggt gcagaccgcc ggcatgtccg cctcggtggc 4080
acggcggatg tcggccgggc gtcgttctgg gctcatggta gactcgagag agatagattt 4140
gtagagagag actggtgatt tcagcgtgtc ctctccaaat gaaatgaact tccttatata 4200
gaggaaggtc ttgcgaagga tagtgggatt gtgcgtcatc ccttacgtca gtggagatat 4260
cacatcaatc cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc 4320
tcgtgggtgg gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct 4380
ttcctttatc gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga 4440
tgaagtgaca gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt 4500
gaaaagtctc aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga 4560
cgagagtgtc gtgctccacc atgttatcac atcaatccac ttgctttgaa gacgtggttg 4620
gaacgtcttc tttttccacg atgctcctcg tgggtggggg tccatctttg ggaccactgt 4680 
cggcagaggc atcttgaacg atagcctttc ctttatcgca atgatggcat ttgtaggtgc 4740
caccttcctt ttctactgtc cttttgatga agtgacagat agctgggcaa tggaatccga 4800 
ggaggtttcc cgatattacc ctttgttgaa aagtctcaat agccctttgg tcttctgaga 4860 
ctgtatcttt gatattcttg gagtagacga gagtgtcgtg ctccaccatg ttggcaagct 4920 
gctctagcca atacgcaaac cgcctctccc cgcgcgttgg ccgattcatt aatgcagctg 4980 
gcacgacagg tttcccgact ggaaagcggg cagtgagcgc aacgcaatta atgtgagtta 5040
gctcactcat taggcacccc aggctttaca ctttatgctt ccggctcgta tgttgtgtgg 5100 
aattgtgagc ggataacaat ttcacacagg aaacagctat gaccatgatt acgaattcga 5160 
gccttgacta gagggtcgac ggtatacaga catgataaga tacattgatg agtttggaca 5220 
aaccacaact agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc 5280
tttatttgta accattataa gctgcaataa acaagttggg gtgggcgaag aactccagca 5340
tgagatcccc gcgctggagg atcatccagc cggcgtcccg gaaaacgatt ccgaagccca 5400 
acctttcata gaaggcggcg gtggaatcga aatctcgtag cacgtgtcag tcctgctcct 5460 
cggccacgaa gtgcacgcag ttgccggccg ggtcgcgcag ggcgaactcc cgcccccacg 5520
gctgctcgcc gatctcggtc atggccggcc cggaggcgtc ccggaagttc gtggacacga 5580 
cctccgacca ctcggcgtac agctcgtcca ggccgcgcac ccacacccag gccagggtgt 5640
tgtccggcac cacctggtcc tggaccgcgc tgatgaacag ggtcacgtcg tcccggacca 5700 
caccggcgaa gtcgtcctcc acgaagtccc gggagaaccc gagccggtcg gtccagaact 5760 
cgaccgctcc ggcgacgtcg cgcgcggtga gcaccggaac ggcactggtc aacttggcca 5820
tggatccaga tttcgctcaa gttagtataa aaaagcaggc ttcaatcctg caggaattcg 5880
atcgacactc tcgtctactc caagaatatc aaagatacag tctcagaaga ccaaagggct 5940
attgagactt ttcaacaaag ggtaatatcg ggaaacctcc tcggattcca ttgcccagct 6000
atctgtcact tcatcaaaag gacagtagaa aaggaaggtg gcacctacaa atgccatcat 6060
tgcgataaag gaaaggctat cgttcaagat gcctctgccg acagtggtcc caaagatgga 6120
cccccaccca cgaggagcat cgtggaaaaa gaagacgttc caaccacgtc ttcaaagcaa 6180 
gtggattgat gtgataacat ggtggagcac gacactctcg tctactccaa gaatatcaaa 6240 
gatacagtct cagaagacca aagggctatt gagacttttc aacaaagggt aatatcggga 6300
aacctcctcg gattccattg cccagctatc tgtcacttca tcaaaaggac agtagaaaag 6360 
gaaggtggca cctacaaatg ccatcattgc gataaaggaa aggctatcgt tcaagatgcc 6420 
tctgccgaca gtggtcccaa agatggaccc ccacccacga ggagcatcgt ggaaaaagaa 6480 
gacgttccaa ccacgtcttc aaagcaagtg gattgatgtg atatctccac tgacgtaagg 6540
gatgacgcac aatcccacta tccttcgcaa gaccttcctc tatataagga agttcatttc 6600 
atttggagag gacacgctga aatcaccagt ctctctctac aaatctatct ctctcgagct 6660 
ttcgcagatc cgggggggca atgagatatg aaaaagcctg aactcaccgc gacgtctgtc 6720 
gagaagtttc tgatcgaaaa gttcgacagc gtctccgacc tgatgcagct ctcggagggc 6780 
gaagaatctc gtgctttcag cttcgatgta ggagggcgtg gatatgtcct gcgggtaaat 6840
agctgcgccg atggtttcta caaagatcgt tatgtttatc ggcactttgc atcggccgcg 6900 
ctcccgattc cggaagtgct tgacattggg gagtttagcg agagcctgac ctattgcatc 6960 
tcccgccgtg cacagggtgt cacgttgcaa gacctgcctg aaaccgaact gcccgctgtt 7020
ctacaaccgg tcgcggaggc tatggatgcg atcgctgcgg ccgatcttag ccagacgagc 7080 
gggttcggcc cattcggacc gcaaggaatc ggtcaataca ctacatggcg tgatttcata 7140 
tgcgcgattg ctgatcccca tgtgtatcac tggcaaactg tgatggacga caccgtcagt 7200
gcgtccgtcg cgcaggctct cgatgagctg atgctttggg ccgaggactg ccccgaagtc 7260 
cggcacctcg tgcacgcgga tttcggctcc aacaatgtcc tgacggacaa tggccgcata 7320 
acagcggtca ttgactggag cgaggcgatg ttcggggatt cccaatacga ggtcgccaac 7380
atcttcttct ggaggccgtg gttggcttgt atggagcagc agacgcgcta cttcgagcgg 7440 
aggcatccgg agcttgcagg atcgccacga ctccgggcgt atatgctccg cattggtctt 7500
gaccaactct atcagagctt ggttgacggc aatttcgatg atgcagcttg ggcgcagggt 7560
cgatgcgacg caatcgtccg atccggagcc gggactgtcg ggcgtacaca aatcgcccgc 7620 
agaagcgcgg ccgtctggac cgatggctgt gtagaagtac tcgccgatag tggaaaccga 7680 
cgccccagca ctcgtccgag ggcaaagaaa tagagtagat gccgaccgga tctgtcgatc 7740
gacaagctcg agtttctcca taataatgtg tgagtagttc ccagataagg gaattagggt 7800
tcctataggg tttcgctcat gtgttgagca tataagaaac ccttagtatg tatttgtatt 7860 
tgtaaaatac ttctatcaat aaaatttcta attcctaaaa ccaaaatcca gtactaaaat 7920 
ccagatcccc cgaattaatt cggcgttaat tcagatcaag cttggcactg gccgtcgttt 7980 
tacaacgtcg tgactgggaa aaccctggcg ttacccaact taatcgcctt gcagcacatc 8040 
cccctttcgc cagctggcgt aatagcgaag aggcccgcac cgatcgccct tcccaacagt 8100 
tgcgcagcct gaatggcgaa tgctagagca gcttgagctt ggatcagatt gtcgtttccc 8160 
gccttcagtt tggggatcct ctagactgaa ggcgggaaac gacaatctga tcatgagcgg 8220
agaattaagg gagtcacgtt atgacccccg ccgatgacgc gggacaagcc gttttacgtt 8280 
tggaactgac agaaccgcaa cgttgaagga gccactcagc cgcgggtttc tggagtttaa 8340
tgagctaagc acatacgtca gaaaccatta ttgcgcgttc aaaagtcgcc taaggtcact 8400 
atcagctagc aaatatttct tgtcaaaaat gctccactga cgttccataa attcccctcg 8460 
gtatccaatt agagtctcat attcactctc aatccaaata atctgcaccg gatctcgaga 8520 
atcgaattcc cgcggccgcc atggtagatc tgactagtaa aggagaagaa cttttcactg 8580
gagttgtccc aattcttgtt gaattagatg gtgatgttaa tgggcacaaa ttttctgtca 8640
gtggagaggg tgaaggtgat gcaacatacg gaaaacttac ccttaaattt atttgcacta 8700
ctggaaaact acctgttccg tggccaacac ttgtcactac tttctcttat ggtgttcaat 8760
gcttttcaag atacccagat catatgaagc ggcacgactt cttcaagagc gccatgcctg 8820
agggatacgt gcaggagagg accatcttct tcaaggacga cgggaactac aagacacgtg 8880
ctgaagtcaa gtttgaggga gacaccctcg tcaacaggat cgagcttaag ggaatcgatt 8940
tcaaggagga cggaaacatc ctcggccaca agttggaata caactacaac tcccacaacg 9000
tatacatcat ggccgacaag caaaagaacg gcatcaaagc caacttcaag acccgccaca 9060
acatcgaaga cggcggcgtg caactcgctg atcattatca acaaaatact ccaattggcg 9120
atggccctgt ccttttacca gacaaccatt acctgtccac acaatctgcc ctttcgaaag 9180
atcccaacga aaagagagac cacatggtcc ttcttgagtt tgtaacagct gctgggatta 9240
cacatggcat ggatgaacta tacaaagcta gccaccacca ccaccaccac gtgtgaattg 9300
gtgaccagct cgaatttccc cgatcgttca aacatttggc aataaagttt cttaagattg 9360
aatcctgttg ccggtcttgc gatgattatc atataatttc tgttgaatta cgttaagcat 9420
gtaataatta acatgtaatg catgacgtta tttatgagat gggtttttat gattagagtc 9480
ccgcaattat acatttaata cgcgatagaa aacaaaatat agcgcgcaaa ctaggataaa 9540
ttatcgcgcg cggtgtcatc tatgttacta gatcgggaat taaactatca gtgtttgaca 9600
ggatatattg gcgggtaaac ctaagagaaa agagcgttta ttagaataac ggatatttaa 9660
aagggcgtga aaaggtttat ccgttcgtcc atttgtatgt gcatgccaac cacagggttc 9720
ccctcgggat caaagtactt tgatccaacc cctccgctgc tatagtgcag tcggcttctg 9780
acgttcagtg cagccgtctt ctgaaaacga catgtcgcac aagtcctaag ttacgcgaca 9840
ggctgccgcc ctgccctttt cctggcgttt tcttgtcgcg tgttttagtc gcataaagta 9900
gaatacttgc gactagaacc ggagacatta cgccatgaac aagagcgccg ccgctggcct 9960
gctgggctat gcccgcgtca gcaccgacga ccaggacttg accaaccaac gggccgaact 10020
gcacgcggcc ggctgcacca agctgttttc cgagaagatc accggcacca ggcgcgaccg 10080
cccggagctg gccaggatgc ttgaccacct acgccctggc gacgttgtga cagtgaccag 10140
gctagaccgc ctggcccgca gcacccgcga cctactggac attgccgagc gcatccagga 10200
ggccggcgcg ggcctgcgta gcctggcaga gccgtgggcc gacaccacca cgccggccgg 10260
ccgcatggtg ttgaccgtgt tcgccggcat tgccgagttc gagcgttccc taatcatcga 10320
ccgcacccgg agcgggcgcg aggccgccaa ggcccgaggc gtgaagtttg gcccccgccc 10380
taccctcacc ccggcacaga tcgcgcacgc ccgcgagctg atcgaccagg aaggccgcac 10440
cgtgaaagag gcggctgcac tgcttggcgt gcatcgctcg accctgtacc gcgcacttga 10500
gcgcagcgag gaagtgacgc ccaccgaggc caggcggcgc ggtgccttcc gtgaggacgc 10560
attgaccgag gccgacgccc tggcggccgc cgagaatgaa cgccaagagg aacaagcatg 10620
aaaccgcacc aggacggcca ggacgaaccg tttttcatta ccgaagagat cgaggcggag 10680
atgatcgcgg ccgggtacgt gttcgagccg cccgcgcacg tctcaaccgt gcggctgcat 10740
gaaatcctgg ccggtttgtc tgatgccaag ctggcggcct ggccggccag cttggccgct 10800
gaagaaaccg agcgccgccg tctaaaaagg tgatgtgtat ttgagtaaaa cagcttgcgt 10860
catgcggtcg ctgcgtatat gatgcgatga gtaaataaac aaatacgcaa ggggaacgca 10920
tgaaggttat cgctgtactt aaccagaaag gcgggtcagg caagacgacc atcgcaaccc 10980
atctagcccg cgccctgcaa ctcgccgggg ccgatgttct gttagtcgat tccgatcccc 11040
agggcagtgc ccgcgattgg gcggccgtgc gggaagatca accgctaacc gttgtcggca 11100
tcgaccgccc gacgattgac cgcgacgtga aggccatcgg ccggcgcgac ttcgtagtga 11160
tcgacggagc gccccaggcg gcggacttgg ctgtgtccgc gatcaaggca gccgacttcg 11220
tgctgattcc ggtgcagcca agcccttacg acatatgggc caccgccgac ctggtggagc 11280
tggttaagca gcgcattgag gtcacggatg gaaggctaca agcggccttt gtcgtgtcgc 11340
gggcgatcaa aggcacgcgc atcggcggtg aggttgccga ggcgctggcc gggtacgagc 11400
tgcccattct tgagtcccgt atcacgcagc gcgtgagcta cccaggcact gccgccgccg 11460
gcacaaccgt tcttgaatca gaacccgagg gcgacgctgc ccgcgaggtc caggcgctgg 11520
ccgctgaaat taaatcaaaa ctcatttgag ttaatgaggt aaagagaaaa tgagcaaaag 11580
cacaaacacg ctaagtgccg gccgtccgag cgcacgcagc agcaaggctg caacgttggc 11640
cagcctggca gacacgccag ccatgaagcg ggtcaacttt cagttgccgg cggaggatca 11700
caccaagctg aagatgtacg cggtacgcca aggcaagacc attaccgagc tgctatctga 11760
atacatcgcg cagctaccag agtaaatgag caaatgaata aatgagtaga tgaattttag 11820
cggctaaagg aggcggcatg gaaaatcaag aacaaccagg caccgacgcc gtggaatgcc 11880
ccatgtgtgg aggaacgggc ggttggccag gcgtaagcgg ctgggttgtc tgccggccct 11940
gcaatggcac tggaaccccc aagcccgagg aatcggcgtg acggtcgcaa accatccggc 12000
ccggtacaaa tcggcgcggc gctgggtgat gacctggtgg agaagttgaa ggccgcgcag 12060
gccgcccagc ggcaacgcat cgaggcagaa gcacgccccg gtgaatcgtg gcaagcggcc 12120
gctgatcgaa tccgcaaaga atcccggcaa ccgccggcag ccggtgcgcc gtcgattagg 12180
aagccgccca agggcgacga gcaaccagat tttttcgttc cgatgctcta tgacgtgggc 12240
acccgcgata gtcgcagcat catggacgtg gccgttttcc gtctgtcgaa gcgtgaccga 12300
cgagctggcg aggtgatccg ctacgagctt ccagacgggc acgtagaggt ttccgcaggg 12360
ccggccggca tggccagtgt gtgggattac gacctggtac tgatggcggt ttcccatcta 12420
accgaatcca tgaaccgata ccgggaaggg aagggagaca agcccggccg cgtgttccgt 12480
ccacacgttg cggacgtact caagttctgc cggcgagccg atggcggaaa gcagaaagac 12540
gacctggtag aaacctgcat tcggttaaac accacgcacg ttgccatgca gc 12592

<210> 7
<211> 3357
<212> DNA
<213> Artificial Sequence
<220>
<223> pGEMEasyNOS Plasmid
<400> 7

tatcactagt gaattcgcgg ccgcctgcag gtcgaccata tgggagagct cccaacgcgt 60
tggatgcata gcttgagtat tctatagtgt cacctaaata gcttggcgta atcatggtca 120
tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc cacacaacat acgagccgga 180
agcataaagt gtaaagcctg gggtgcctaa tgagtgagct aactcacatt aattgcgttg 240
cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc 300
caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt ccgcttcctc gctcactgac 360
tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag ctcactcaaa ggcggtaata 420
cggttatcca cagaatcagg ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa 480
aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt tccataggct ccgcccccct 540
gacgagcatc acaaaaatcg acgctcaagt cagaggtggc gaaacccgac aggactataa 600
agataccagg cgtttccccc tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg 660
cttaccggat acctgtccgc ctttctccct tcgggaagcg tggcgctttc tcatagctca 720
cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa 780
ccccccgttc agcccgaccg ctgcgcctta tccggtaact atcgtcttga gtccaacccg 840
gtaagacacg acttatcgcc actggcagca gccactggta acaggattag cagagcgagg 900
tatgtaggcg gtgctacaga gttcttgaag tggtggccta actacggcta cactagaaga 960
acagtatttg gtatctgcgc tctgctgaag ccagttacct tcggaaaaag agttggtagc 1020
tcttgatccg gcaaacaaac caccgctggt agcggtggtt tttttgtttg caagcagcag 1080
attacgcgca gaaaaaaagg atctcaagaa gatcctttga tcttttctac ggggtctgac 1140
gctcagtgga acgaaaactc acgttaaggg attttggtca tgagattatc aaaaaggatc 1200
ttcacctaga tccttttaaa ttaaaaatga agttttaaat caatctaaag tatatatgag 1260
taaacttggt ctgacagtta ccaatgctta atcagtgagg cacctatctc agcgatctgt 1320
ctatttcgtt catccatagt tgcctgactc cccgtcgtgt agataactac gatacgggag 1380
ggcttaccat ctggccccag tgctgcaatg ataccgcgag acccacgctc accggctcca 1440
gatttatcag caataaacca gccagccgga agggccgagc gcagaagtgg tcctgcaact 1500
ttatccgcct ccatccagtc tattaattgt tgccgggaag ctagagtaag tagttcgcca 1560
gttaatagtt tgcgcaacgt tgttgccatt gctacaggca tcgtggtgtc acgctcgtcg 1620
tttggtatgg cttcattcag ctccggttcc caacgatcaa ggcgagttac atgatccccc 1680
atgttgtgca aaaaagcggt tagctccttc ggtcctccga tcgttgtcag aagtaagttg 1740
gccgcagtgt tatcactcat ggttatggca gcactgcata attctcttac tgtcatgcca 1800
tccgtaagat gcttttctgt gactggtgag tactcaacca agtcattctg agaatagtgt 1860
atgcggcgac cgagttgctc ttgcccggcg tcaatacggg ataataccgc gccacatagc 1920
agaactttaa aagtgctcat cattggaaaa cgttcttcgg ggcgaaaact ctcaaggatc 1980
ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg cacccaactg atcttcagca 2040
tcttttactt tcaccagcgt ttctgggtga gcaaaaacag gaaggcaaaa tgccgcaaaa 2100
aagggaataa gggcgacacg gaaatgttga atactcatac tcttcctttt tcaatattat 2160
tgaagcattt atcagggtta ttgtctcatg agcggataca tatttgaatg tatttagaaa 2220
aataaacaaa taggggttcc gcgcacattt ccccgaaaag tgccacctga tgcggtgtga 2280
aataccgcac agatgcgtaa ggagaaaata ccgcatcagg aaattgtaag cgttaatatt 2340
ttgttaaaat tcgcgttaaa tttttgttaa atcagctcat tttttaacca ataggccgaa 2400
atcggcaaaa tcccttataa atcaaaagaa tagaccgaga tagggttgag tgttgttcca 2460
gtttggaaca agagtccact attaaagaac gtggactcca acgtcaaagg gcgaaaaacc 2520
gtctatcagg gcgatggccc actacgtgaa ccatcaccct aatcaagttt tttggggtcg 2580
aggtgccgta aagcactaaa tcggaaccct aaagggagcc cccgatttag agcttgacgg 2640
ggaaagccgg cgaacgtggc gagaaaggaa gggaagaaag cgaaaggagc gggcgctagg 2700
gcgctggcaa gtgtagcggt cacgctgcgc gtaaccacca cacccgccgc gcttaatgcg 2760
ccgctacagg gcgcgtccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 2820
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 2880
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattgt 2940
aatacgactc actatagggc gaattgggcc cgacgtcgca tgctcccggc cgccatggcg 3000
gccgcgggaa ttcgattctc gagatccggt gcagattatt tggattgaga gtgaatatga 3060
gactctaatt ggataccgag gggaatttat ggaacgtcag tggagcattt ttgacaagaa 3120
atatttgcta gctgatagtg accttaggcg acttttgaac gcgcaataat ggtttctgac 3180
gtatgtgctt agctcattaa actccagaaa cccgcggctg agtggctcct tcaacgttgc 3240
ggttctgtca gttccaaacg taaaacggct tgtcccgcgt catcggcggg ggtcataacg 3300
tgactccctt aattctccgc tcatgatcag attgtcgttt cccgccttca gtctaga 3357



<210> 8 






<211> 10122 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> p1302NOS Plasmid



<400> 8 

catggtagat 

tgaattagat 

tgcaacatac 

gtggccaaca 

tcatatgaag 

gaccatcttc 

agacaccctc 

cctcggccac 

gcaaaagaac 

gcaactcgct 

agacaaccat 

ccacatggtc 

atacaaagct 

ccgatcgttc 

cgatgattat 

gcatgacgtt 

acgcgataga 

ctatgttact 

cctaagagaa 

tccgttcgtc 

ttgatccaac 

tctgaaaacg 

tcctggcgtt 

cggagacatt 

agcaccgacg 

aagctgtttt 

cttgaccacc 

agcacccgcg 

agcctggcag 

ttcgccggca 

gaggccgcca 

atcgcgcacg 

ctgcttggcg 

cccaccgagg 

ctggcggccg 

aggacgaacc 

tgttcgagcc 

ctgatgccaa 

gtctaaaaag 

tgatgcgatg 

taaccagaaa 

actcgccggg 

ggcggccgtg 

ccgcgacgtg 

ggcggacttg 

aagcccttac 

ggtcacggat 

catcggcggt 

tatcacgcag 

agaacccgag 

actcatttga 

ggccgtccga 

gccatgaagc 

gcggtacgcc 

gagtaaatga 

ggaaaatcaa 

cggttggcca 

caagcccgag 

cgctgggtga 



ctgactagta 
ggtgatgtfca 
ggaaaactta 
cttgtcacta 
cggcacgact 
ttcaaggacg 
gtcaacagga 
aagttggaat 
ggcatcaaag 
gatcattatc 
tacctgtcca 
cttcttgagt 
agccaccacc 
aaacatttgg 
catataattt 
atttatgaga 
aaacaaaata 
agatcgggaa 
aagagcgttt 
catttgtatg 
ccctccgctg 
acatgtcgca 
ttcttgtcgc 
acgccatgaa 
accaggactt 
ccgagaagat 
tacgccctgg 
acctactgga 
agccgtgggc 
ttgccgagtt 
aggcccgagg 
cccgcgagct 
tgcatcgctc 
ccaggcggcg 
ccgagaatga 
gtttttcatt 
gcccgcgcac 
gctggcggcc 
gtgatgtgta 
agtaaataaa 
ggcgggtcag 
gccgatgttc 
c'gggaagatc 
aaggccatcg 
gctgtgtccg 
gacatatggg 
ggaaggctac 
gaggttgccg 
cgcgtgagct 
ggcgacgctg 
gtfcaafcgagg 
gcgcacgcag 
gggtcaactt 
aaggcaagac 
gcaaatgaat 
gaacaaccag 
ggcgtaagcg 
gaatcggcgt 
tgacctggtg 



aaggagaaga 
atgggcacaa 
cccttaaatt 
ctttctctta 
tcttcaagag 
acgggaacta 
tcgagcttaa 
acaactacaa 
ccaacttcaa 
aacaaaatac 
cacaatctgc 
ttgtaacagc 
accaccacca 
caataaagtt 
ctgttgaatt 
tgggttttta 
tagcgcgcaa 
ttaaactatc 
attagaataa 
tgcatgccaa 
ctatagtgca 
caagtcctaa 
gtgttttagt 
caagagcgcc 
gaccaaccaa 
caccggcacc 
cgacgttgtg 
cattgccgag 
cgacaccacc 
cgagcgttcc 
cgtgaagttt 
gatcgaccag 
gaccctgtac 
cggtgccttc 
acgccaagag 
accgaagaga 
gtctcaaccg 
tggccggcca 
tttgagtaaa 
caaatacgca 
gcaagacgac 
tgttagtcga 
aaccgctaac 
gccggcgcga 
cgatcaaggc 
ccaccgccga 
aagcggcctt 
aggcgctggc 
acccaggcac 
cccgcgaggt 
taaagagaaa 
cagcaaggct 
tcagttgccg 
cattaccgag 
aaatgagtag 
gcaccgacgc 
gctgggttgt 
gacggtcgca 
gagaagttga 



acttttcact 
attttctgtc 
tatttgcact 
tggtgt tcaa 
cgccatgcct 
caagacacgt 
gggaatcgat 
ctcccacaac 
gacccgccac 
tccaattggc 
cctttcgaaa 
tgctgggatt 
cgtgtgaatt 
tcttaagatt 
acgttaagca 
tgattagagt 
actaggataa 
agtgtttgac 
cggatattta 
ccacagggtt 
gtcggcttct 
gttacgcgac 
cgcataaagt 
gccgctggcc 
cgggccgaac 
aggcgcgacc 
acagtgacca 
cgcatccagg 
acgccggccg 
ctaatcatcg 
ggcccccgcc 
gaaggccgca 
cgcgcacttg 
cgtgaggacg 
gaacaagcat 
tcgaggcgga 
tgcggctgca 
gcttggccgc 
acagcttgcg 
aggggaacgc 
catcgcaacc 
ttccgatccc 
cgttgtcggc 
cttcgtagtg 
agccgacttc 
cctggtggag 
tgtcgtgtcg 
cgggfc acgag 
tgccgccgcc 
ccaggcgctg 
atgagcaaaa 
gcaacgttgg 
gcggaggatc 
ctgctatctg 
atgaatttta 
cgtggaatgc 
ctgccggccc 
aaccatccgg 
aggccgcgca 



ggagttgtcc 

agtggagagg 

actggaaaac 

tgcttttcaa 

gagggatacg 

gctgaagtca 

ttcaaggagg 

gtatacatca 

aacatcgaag 

gatggccctg 

gatcccaacg 

acacatggca 

ggtgaccagc 

gaatcctgtt 

tgtaataatt 

cccgcaatta 

attatcgcgc 

aggatatatt 

aaagggcgtg 

cccctcggga 

gacgttcagt 

a 99 ct gccgc 

agaatacttg 

tgctgggcta" 

tgcacgcggc 

gcccggagct 

ggctagaccg 

aggccggcgc 

gccgcatggt 

accgcacccg 

ctaccctcac 

ccgtgaaaga 

agcgcagcga 

cattgaccga 

gaaaccgcac 

gatgatcgcg 

tgaaatcctg 

tgaagaaacc 

tcatgcggtc 

atgaaggtta 

catctagccc 

cagggcagtg 

atcgaccgcc 

atcgacggag 

gtgctgattc 

ctggttaagc 

cgggcgatca 

ctgcccattc 

ggcacaaccg 

gccgctgaaa 

gcacaaacac 

ccagcctggc 

acaccaagct 

aatacatcgc 

gcggctaaag 

cccatgtgtg 

tgcaatggca 

cccggtacaa 

ggccgcccag 



caattcttgt 
gtgaaggtga 
tacctgttcc 
gatacccaga 
tgcaggagag 
agtttgaggg 
acggaaacat 
tggccgacaa 
acggcggcgt 
tccttttacc 
aaaagagaga 
tggatgaact 
tcgaatttcc 
gccggtcttg 
aacatgtaat 
tacatttaat 
gcggtgtcat 
99cgggtaaa 
aaaaggttta 
tcaaagtact 
gcagccgtct 
cctgcccttt 
cgactagaac 
tgcccgcgtc 
cggctgcacc 
ggccaggatg 
cctggcccgc 
gggcctgcgt 
gttgaccgtg 
gagcgggcgc 
cccggcacag 
ggcggctgca 
ggaagtgacg 
ggccgacgcc 
caggacggcc 
gccgggtacg 
gccggtttgt 
gagcgccgcc 
gctgcgtata 
tcgctgtact 
gcgccctgca 
cccgcgattg 
cgacgattga 
cgccccaggc 
cggtgcagcc 
agcgcattga 
aaggcacgcg 
ttgagtcccg 
ttcttgaatc 
ttaaatcaaa 
gctaagtgcc 
agacacgcca 
gaagatgtac 
gcagctacca 
gaggcggcat 
gaggaacggg 
ctggaacccc 
atcggcgcgg 
cggcaacgca 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 

3420 

3480 

3540 
tcgaggcaga agcacgcccc ggtgaatcgt ggcaagcggc cgctgatcga atccgcaaag 3600
aatcccggca accgccggca gccggtgcgc cgtcgattag gaagccgccc aagggcgacg 3660
agcaaccaga ttttttcgtt ccgatgctct atgacgtggg cacccgcgat agtcgcagca 3720
tcatggacgt ggccgttttc cgtctgtcga agcgtgaccg acgagctggc gaggtgatcc 3780
gctacgagct tccagacggg cacgtagagg tttccgcagg gccggccggc atggccagtg 3840
tgtgggatta cgacctggta ctgatggcgg tttcccatct aaccgaatcc atgaaccgat 3900
accgggaagg gaagggagac aagcccggcc gcgtgttccg tccacacgtt gcggacgtac 3960
tcaagttctg ccggcgagcc gatggcggaa agcagaaaga cgacctggta gaaacctgca 4020
ttcggttaaa caccacgcac gttgccatgc agcgtacgaa gaaggccaag aacggccgcc 4080
tggtgacggt atccgagggt gaagccttga ttagccgcta caagatcgta aagagcgaaa 4140
ccgggcggcc ggagtacatc gagatcgagc tagctgattg gatgtaccgc gagatcacag 4200
aaggcaagaa cccggacgtg ctgacggttc accccgatta ctttttgatc gatcccggca 4260
tcggccgttt tctctaccgc ctggcacgcc gcgccgcagg caaggcagaa gccagatggt 4320
tgttcaagac gatctacgaa cgcagtggca gcgccggaga gttcaagaag ttctgtttca 4380
ccgtgcgcaa gctgatcggg tcaaatgacc tgccggagta cgatttgaag gaggaggcgg 4440
ggcaggctgg cccgatccta gtcatgcgct accgcaacct gatcgagggc gaagcatccg 4500
ccggttccta atgtacggag cagatgctag ggcaaattgc cctagcaggg gaaaaaggtc 4560
gaaaaggtct ctttcctgtg gatagcacgt acattgggaa cccaaagccg tacattggga 4620
accggaaccc gtacattggg aacccaaagc cgtacattgg gaaccggtca cacatgtaag 4680
tgactgatat aaaagagaaa aaaggcgatt tttccgccta aaactcttta aaacttatta 4740
aaactcttaa aacccgcctg gcctgtgcat aactgtctgg ccagcgcaca gccgaagagc 4800
tgcaaaaagc gcctaccctt cggtcgctgc gctccctacg ccccgccgct tcgcgtcggc 4860
ctatcgcggc cgctggccgc tcaaaaatgg ctggcctacg gccaggcaat ctaccagggc 4920
gcggacaagc cgcgccgtcg ccactcgacc gccggcgccc acatcaaggc accctgcctc 4980
gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc agctcccgga gacggtcaca 5040
gcttgtctgt aagcggatgc cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt 5100
ggcgggtgtc ggggcgcagc catgacccag tcacgtagcg atagcggagt gtatactggc 5160
ttaactatgc ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac 5220
cgcacagatg cgtaaggaga aaataccgca tcaggcgctc ttccgcttcc tcgctcactg 5280
actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa 5340
tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc 5400
aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc 5460
ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat 5520
aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc 5580
cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct 5640
cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg 5700
aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc 5760
cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga 5820
ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa 5880
ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta 5940
gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc 6000
agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg 6060
acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgcattct aggtactaaa 6120
acaattcatc cagtaaaata taatatttta ttttctccca atcaggcttg atccccagta 6180
agtcaaaaaa tagctcgaca tactgttctt ccccgatatc ctccctgatc gaccggacgc 6240
agaaggcaat gtcataccac ttgtccgccc tgccgcttct cccaagatca ataaagccac 6300
ttactttgcc atctttcaca aagatgttgc tgtctcccag gtcgccgtgg gaaaagacaa 6360
gttcctcttc gggcttttcc gtctttaaaa aatcatacag ctcgcgcgga tctttaaatg 6420
gagtgtcttc ttcccagttt tcgcaatcca catcggccag atcgttattc agtaagtaat 6480
ccaattcggc taagcggctg tctaagctat tcgtataggg acaatccgat atgtcgatgg 6540
agtgaaagag cctgatgcac tccgcataca gctcgataat cttttcaggg ctttgttcat 6600
cttcatactc ttccgagcaa aggacgccat cggcctcact catgagcaga ttgctccagc 6660
catcatgccg ttcaaagtgc aggacctttg gaacaggcag ctttccttcc agccatagca 6720
tcatgtcctt ttcccgttcc acatcatagg tggtcccttt ataccggctg tccgtcattt 6780
ttaaatatag gttttcattt tctcccacca gcttatatac cttagcagga gacattcctt 6840
ccgtatcttt tacgcagcgg tatttttcga tcagtttttt caattccggt gatattctca 6900
ttttagccat ttattatttc cttcctcttt tctacagtat ttaaagatac cccaagaagc 6960
taattataac aagacgaact ccaattcact gttccttgca ttctaaaacc ttaaatacca 7020
gaaaacagct ttttcaaagt tgttttcaaa gttggcgtat aacatagtat cgacggagcc 7080
gattttgaaa ccgcggtgat cacaggcagc aacgctctgt catcgttaca atcaacatgc 7140
taccctccgc gagatcatcc gtgtttcaaa cccggcagct tagttgccgt tcttccgaat 7200
agcatcggta acatgagcaa agtctgccgc cttacaacgg ctctcccgct gacgccgtcc 7260
cggactgatg ggctgcctgt atcgagtggt gattttgtgc cgagctgccg gtcggggagc 7320
tgttggctgg ctggtggcag gatatattgt ggtgtaaaca aattgacgct tagacaactt 7380
aataacacat tgcggacgtt tttaatgtac tgaattaacg ccgaattaat tcgggggatc 7440
tggattttag tactggattt tggttttagg aattagaaat tttattgata gaagtatttt 7500
acaaatacaa atacatacta agggtttctt atatgctcaa cacatgagcg aaaccctata 7560
ggaaccctaa ttcccttatc tgggaactac tcacacatta ttatggagaa actcgagctt 7620
gtcgatcgac agatccggtc ggcatctact ctatttcttt gccctcggac gagtgctggg 7680
gcgtcggttt ccactatcgg cgagtacttc tacacagcca tcggtccaga cggccgcgct 7740
tctgcgggcg atttgtgtac gcccgacagt cccggctccg gatcggacga ttgcgtcgca 7800
tcgaccctgc gcccaagctg catcatcgaa attgccgtca accaagctct gatagagttg 7860
gtcaagacca atgcggagca tatacgcccg gagtcgtggc gatcctgcaa gctccggatg 7920
cctccgctcg aagtagcgcg tctgctgctc catacaagcc aaccacggcc tccagaagaa 7980
gatgttggcg acctcgtatt gggaatcccc gaacatcgcc tcgctccagt caatgaccgc 8040
tgttatgcgg ccattgtccg tcaggacatt gttggagccg aaatccgcgt gcacgaggtg 8100
ccggacttcg gggcagtcct cggcccaaag catcagctca tcgagagcct gcgcgacgga 8160
cgcactgacg gtgtcgtcca tcacagtttg ccagtgatac acatggggat cagcaatcgc 8220
gcatatgaaa tcacgccatg tagtgtattg accgattcct tgcggtccga atgggccgaa 8280
cccgctcgtc tggctaagat cggccgcagc gatcgcatcc atagcctccg cgaccggttg 8340
tagaacagcg ggcagttcgg tttcaggcag gtcttgcaac gtgacaccct gtgcacggcg 8400
ggagatgcaa taggtcaggc tctcgctaaa ctccccaatg tcaagcactt ccggaatcgg 8460
gagcgcggcc gatgcaaagt gccgataaac ataacgatct ttgtagaaac catcggcgca 8520
gctatttacc cgcaggacat atccacgccc tcctacatcg aagctgaaag cacgagattc 8580
ttcgccctcc gagagctgca tcaggtcgga gacgctgtcg aacttttcga tcagaaactt 8640
ctcgacagac gtcgcggtga gttcaggctt tttcatatct cattgccccc ccggatctgc 8700
gaaagctcga gagagataga tttgtagaga gagactggtg atttcagcgt gtcctctcca 8760
aatgaaatga acttccttat atagaggaag gtcttgcgaa ggatagtggg attgtgcgtc 8820
atcccttacg tcagtggaga tatcacatca atccacttgc tttgaagacg tggttggaac 8880
gtcttctttt tccacgatgc tcctcgtggg tgggggtcca tctttgggac cactgtcggc 8940
agaggcatct tgaacgatag cctttccttt atcgcaatga tggcatttgt aggtgccacc 9000
ttccttttct actgtccttt tgatgaagtg acagatagct gggcaatgga atccgaggag 9060
gtttcccgat attacccttt gttgaaaagt ctcaatagcc ctttggtctt ctgagactgt 9120
atctttgata ttcttggagt agacgagagt gtcgtgctcc accatgttat cacatcaatc 9180
cacttgcttt gaagacgtgg ttggaacgtc ttctttttcc acgatgctcc tcgtgggtgg 9240
gggtccatct ttgggaccac tgtcggcaga ggcatcttga acgatagcct ttcctttatc 9300
gcaatgatgg catttgtagg tgccaccttc cttttctact gtccttttga tgaagtgaca 9360
gatagctggg caatggaatc cgaggaggtt tcccgatatt accctttgtt gaaaagtctc 9420
aatagccctt tggtcttctg agactgtatc tttgatattc ttggagtaga cgagagtgtc 9480
gtgctccacc atgttggcaa gctgctctag ccaatacgca aaccgcctct ccccgcgcgt 9540
tggccgattc attaatgcag ctggcacgac aggtttcccg actggaaagc gggcagtgag 9600
cgcaacgcaa ttaatgtgag ttagctcact cattaggcac cccaggcttt acactttatg 9660
cttccggctc gtatgttgtg tggaattgtg agcggataac aatttcacac aggaaacagc 9720
tatgaccatg attacgaatt cgagctcggt acccggggat cctctagact gaaggcggga 9780
aacgacaatc tgatcatgag cggagaatta agggagtcac gttatgaccc ccgccgatga 9840
cgcgggacaa gccgttttac gtttggaact gacagaaccg caacgttgaa ggagccactc 9900
agccgcgggt ttctggagtt taatgagcta agcacatacg tcagaaacca ttattgcgcg 9960
ttcaaaagtc gcctaaggtc actatcagct agcaaatatt tcttgtcaaa aatgctccac 10020
tgacgttcca taaattcccc tcggtatcca attagagtct catattcact ctcaatccaa 10080
ataatctgca ccggatctcg agaatcgaat tcccgcggcc gc 10122

<210> 9
<211> 621
<212> DNA

<213> Artificial Sequence



<220>
<223> N. tabacum rDNA intergenic spacer (IGS) sequence

<300>

<308> Genbank #Y08422
<309> 1997-10-31

<400> 9

gtgctagcca atgtttaaca agatgtcaag cacaatgaat gttggtggtt ggtggtcgtg 60
gctggcggtg gtggaaaatt gcggtggttc gagcggtagt gatcggcgat ggttggtgtt 120
tgcagcggtg tttgatatcg gaatcactta tggtggttgt cacaatggag gtgcgtcatg 180
gttattggtg gttggtcatc tatatatttt tataataata ttaagtattt tacctatttt 240
ttacatattt tttattaaat ttatgcattg tttgtatttt taaatagttt ttatcgtact 300
tgttttataa aatattttat tattttatgt gttatattat tacttgatgt attggaaatt 360
ttctccattg ttttttctat atttataata attttcttat ttttttttgt tttattatgt 420
attttttcgt tttataataa atatttatta aaaaaaatat tatttttgta aaatatatca 480
tttacaatgt ttaaaagtca tttgtgaata tattagctaa gttgtacttc tttttgtgca 540
tttggtgttg tacatgtcta ttatgattct ctggccaaaa catgtctact cctgtcactt 600
gggttttttt ttttaagaca t 621



<210> 10
<211> 25
<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer NTIGS-F1
<400> 10

gtgctagcca atgtttaaca agatg 25

<210> 11

<211> 28

<212> DNA

<213> Artificial Sequence
<220>

<223> PCR Primer NTIGS-R1

<400> 11

atgtcttaaa aaaaaaaacc caagtgac 28

<210> 12

<211> 233

<212> DNA

<213> Mus musculus

<300>

<308> Genbank #V00846
<309> 1989-07-06

<400> 12

gacctggaat atggcgagaa aactgaaaat cacggaaaat gagaaataca cactttagga 60

cgtgaaatat ggcgaggaaa actgaaaaag gtggaaaatt tagaaatgtc cactgtagga 120

cgtggaatat ggcaagaaaa ctgaaaatca tggaaaatga gaaacatcca cttgacgact 180

tgaaaaatga cgaaatcact aaaaaacgtg aaaaatgaga aatgcacact gaa 233

<210> 13

<211> 31

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer MSAT-F1

<400> 13

aataccgcgg aagcttgacc tggaatatcg c 31

<210> 14
<211> 27
<212> DNA

<213> Artificial Sequence
<220>

<223> Primer MSAT-R1
<400> 14

ataaccgcgg agtccttcag tgtgcat 27

<210> 15
<211> 277
<212> DNA

<213> Artificial Sequence

<220>
<223> Nopaline Synthase Promoter Fragment
<300>

<308> Genbank #U09365
<309> 1997-10-17

<400> 15

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60

tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120

aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180

attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240

gcgcgcggtg tcatctatgt tactagatcg ggaattc 277

<210> 16 

<211> 1812 

<212> DNA 

<213> Escherichia coli 

<220> 

<221> CDS 

<222> (1)...(1812)

<223> Beta-glucuronidase

<300> 

<308> Genbank #S69414 
<309> 1994-09-23 

<400> 16 

atg tta cgt cct gta gaa acc cca acc cgt gaa atc aaa aaa ctc gac 48
Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

ggc ctg tgg gca ttc agt ctg gat cgc gaa aac tgt gga att gat cag 96
Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

cgt tgg tgg gaa agc gcg tta caa gaa agc cgg gca att gct gtg cca 144
Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

ggc agt ttt aac gat cag ttc gcc gat gca gat att cgt aat tat gcg 192
Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

ggc aac gtc tgg tat cag cgc gaa gtc ttt ata ccg aaa ggt tgg gca 240
Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

ggc cag cgt atc gtg ctg cgt ttc gat gcg gtc act cat tac ggc aaa 288
Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

gtg tgg gtc aat aat cag gaa gtg atg gag cat cag ggc ggc tat acg 336
Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

cca ttt gaa gcc gat gtc acg ccg tat gtt att gcc ggg aaa agt gta 384
Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

cgt atc acc gtt tgt gtg aac aac gaa ctg aac tgg cag act atc ccg 432
Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

ccg gga atg gtg att acc gac gaa aac ggc aag aaa aag cag tct tac 480
Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160
ttc cat gat ttc ttt aac tat gcc gga atc cat cgc agc gta atg ctc 528
Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

tac acc acg ccg aac acc tgg gtg gac gat atc acc gtg gtg acg cat 576
Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

gtc gcg caa gac tgt aac cac gcg tct gtt gac tgg cag gtg gtg gcc 624
Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

aat ggt gat gtc agc gtt gaa ctg cgt gat gcg gat caa cag gtg gtt 672
Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

gca act gga caa ggc act agc ggg act ttg caa gtg gtg aat ccg cac 720
Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

ctc tgg caa ccg ggt gaa ggt tat ctc tat gaa ctg tgc gtc aca gcc 768
Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

aaa agc cag aca gag tgt gat atc tac ccg ctt cgc gtc ggc atc cgg 816
Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

tca gtg gca gtg aag ggc gaa cag ttc ctg att aac cac aaa ccg ttc 864
Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

tac ttt act ggc ttt ggt cgt cat gaa gat gcg gac ttg cgt ggc aaa 912
Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

gga ttc gat aac gtg ctg atg gtg cac gac cac gca tta atg gac tgg 960
Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

att ggg gcc aac tcc tac cgt acc tcg cat tac cct tac gct gaa gag 1008
Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

atg ctc gac tgg gca gat gaa cat ggc atc gtg gtg att gat gaa act 1056
Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

gct gct gtc ggc ttt aac ctc tct tta ggc att ggt ttc gaa gcg ggc 1104
Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

aac aag ccg aaa gaa ctg tac agc gaa gag gca gtc aac ggg gaa act 1152
Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

cag caa gcg cac tta cag gcg att aaa gag ctg ata gcg cgt gac aaa 1200
Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

aac cac cca agc gtg gtg atg tgg agt att gcc aac gaa ccg gat acc 1248
Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

cgt ccg caa ggt gca cgg gaa tat ttc gcg cca ctg gcg gaa gca acg 1296
Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

cgt aaa ctc gac ccg acg cgt ccg atc acc tgc gtc aat gta atg ttc 1344
Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

tgc gac gct cac acc gat acc atc agc gat ctc ttt gat gtg ctg tgc 1392
Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

ctg aac cgt tat tac gga tgg tat gtc caa agc ggc gat ttg gaa acg 1440
Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

gca gag aag gta ctg gaa aaa gaa ctt ctg gcc tgg cag gag aaa ctg 1488
Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

cat cag ccg att atc atc acc gaa tac ggc gtg gat acg tta gcc ggg 1536
His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

ctg cac tca atg tac acc gac atg tgg agt gaa gag tat cag tgt gca 1584
Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

tgg ctg gat atg tat cac cgc gtc ttt gat cgc gtc agc gcc gtc gtc 1632
Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

ggt gaa cag gta tgg aat ttc gcc gat ttt gcg acc tcg caa ggc ata 1680
Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

ttg cgc gtt ggc ggt aac aag aaa ggg atc ttc act cgc gac cgc aaa 1728
Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

ccg aag tcg gcg gct ttt ctg ctg caa aaa cgc tgg act ggc atg aac 1776
Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

ttc ggt gaa aaa ccg cag cag gga ggc aaa caa tga 1812
Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln *
595 600

<210> 17 
<211> 603 
<212> PRT 

<213> Escherichia coli 



<300> 

<308> Genbank #S69414 
<309> 1994-09-23 



<400> 17

Met Leu Arg Pro Val Glu Thr Pro Thr Arg Glu Ile Lys Lys Leu Asp
1 5 10 15

Gly Leu Trp Ala Phe Ser Leu Asp Arg Glu Asn Cys Gly Ile Asp Gln
20 25 30

Arg Trp Trp Glu Ser Ala Leu Gln Glu Ser Arg Ala Ile Ala Val Pro
35 40 45

Gly Ser Phe Asn Asp Gln Phe Ala Asp Ala Asp Ile Arg Asn Tyr Ala
50 55 60

Gly Asn Val Trp Tyr Gln Arg Glu Val Phe Ile Pro Lys Gly Trp Ala
65 70 75 80

Gly Gln Arg Ile Val Leu Arg Phe Asp Ala Val Thr His Tyr Gly Lys
85 90 95

Val Trp Val Asn Asn Gln Glu Val Met Glu His Gln Gly Gly Tyr Thr
100 105 110

Pro Phe Glu Ala Asp Val Thr Pro Tyr Val Ile Ala Gly Lys Ser Val
115 120 125

Arg Ile Thr Val Cys Val Asn Asn Glu Leu Asn Trp Gln Thr Ile Pro
130 135 140

Pro Gly Met Val Ile Thr Asp Glu Asn Gly Lys Lys Lys Gln Ser Tyr
145 150 155 160

Phe His Asp Phe Phe Asn Tyr Ala Gly Ile His Arg Ser Val Met Leu
165 170 175

Tyr Thr Thr Pro Asn Thr Trp Val Asp Asp Ile Thr Val Val Thr His
180 185 190

Val Ala Gln Asp Cys Asn His Ala Ser Val Asp Trp Gln Val Val Ala
195 200 205

Asn Gly Asp Val Ser Val Glu Leu Arg Asp Ala Asp Gln Gln Val Val
210 215 220

Ala Thr Gly Gln Gly Thr Ser Gly Thr Leu Gln Val Val Asn Pro His
225 230 235 240

Leu Trp Gln Pro Gly Glu Gly Tyr Leu Tyr Glu Leu Cys Val Thr Ala
245 250 255

Lys Ser Gln Thr Glu Cys Asp Ile Tyr Pro Leu Arg Val Gly Ile Arg
260 265 270

Ser Val Ala Val Lys Gly Glu Gln Phe Leu Ile Asn His Lys Pro Phe
275 280 285

Tyr Phe Thr Gly Phe Gly Arg His Glu Asp Ala Asp Leu Arg Gly Lys
290 295 300

Gly Phe Asp Asn Val Leu Met Val His Asp His Ala Leu Met Asp Trp
305 310 315 320

Ile Gly Ala Asn Ser Tyr Arg Thr Ser His Tyr Pro Tyr Ala Glu Glu
325 330 335

Met Leu Asp Trp Ala Asp Glu His Gly Ile Val Val Ile Asp Glu Thr
340 345 350

Ala Ala Val Gly Phe Asn Leu Ser Leu Gly Ile Gly Phe Glu Ala Gly
355 360 365

Asn Lys Pro Lys Glu Leu Tyr Ser Glu Glu Ala Val Asn Gly Glu Thr
370 375 380

Gln Gln Ala His Leu Gln Ala Ile Lys Glu Leu Ile Ala Arg Asp Lys
385 390 395 400

Asn His Pro Ser Val Val Met Trp Ser Ile Ala Asn Glu Pro Asp Thr
405 410 415

Arg Pro Gln Gly Ala Arg Glu Tyr Phe Ala Pro Leu Ala Glu Ala Thr
420 425 430

Arg Lys Leu Asp Pro Thr Arg Pro Ile Thr Cys Val Asn Val Met Phe
435 440 445

Cys Asp Ala His Thr Asp Thr Ile Ser Asp Leu Phe Asp Val Leu Cys
450 455 460

Leu Asn Arg Tyr Tyr Gly Trp Tyr Val Gln Ser Gly Asp Leu Glu Thr
465 470 475 480

Ala Glu Lys Val Leu Glu Lys Glu Leu Leu Ala Trp Gln Glu Lys Leu
485 490 495

His Gln Pro Ile Ile Ile Thr Glu Tyr Gly Val Asp Thr Leu Ala Gly
500 505 510

Leu His Ser Met Tyr Thr Asp Met Trp Ser Glu Glu Tyr Gln Cys Ala
515 520 525

Trp Leu Asp Met Tyr His Arg Val Phe Asp Arg Val Ser Ala Val Val
530 535 540

Gly Glu Gln Val Trp Asn Phe Ala Asp Phe Ala Thr Ser Gln Gly Ile
545 550 555 560

Leu Arg Val Gly Gly Asn Lys Lys Gly Ile Phe Thr Arg Asp Arg Lys
565 570 575

Pro Lys Ser Ala Ala Phe Leu Leu Gln Lys Arg Trp Thr Gly Met Asn
580 585 590

Phe Gly Glu Lys Pro Gln Gln Gly Gly Lys Gln
595 600

<210> 18
<211> 277 
<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Nopaline Synthase Terminator Sequence 
<300> 

<308> Genbank #U09365 
<309> 1995-10-17 

<400> 18 

gagctcgaat ttccccgatc gttcaaacat ttggcaataa agtttcttaa gattgaatcc 60 

tgttgccggt cttgcgatga ttatcatata atttctgttg aattacgtta agcatgtaat 120 

aattaacatg taatgcatga cgttatttat gagatgggtt tttatgatta gagtcccgca 180 

attatacatt taatacgcga tagaaaacaa aatatagcgc gcaaactagg ataaattatc 240

gcgcgcggtg tcatctatgt tactagatcg ggaattc 277 

<210> 19 
<211> 3438 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pLIT38attBZeo Plasmid 



<400> 19

tcgaccctct agtcaaggcc ttaagtgagt cgtattacgg actggccgtc gttttacaac 60
gtcgtgactg ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt 120
tcgccagctg gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca 180
gcctgaatgg cgaatggcgc ttcgcttggt aataaagccc gcttcggcgg gctttttttt 240
gttaactacg tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt 300
tttctaaata cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca 360
ataatattga aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt 420
ttttgcggca ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga 480
tgctgaagat cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa 540
gatccttgag agttttcgcc ccgaagaacg ttctccaatg atgagcactt ttaaagttct 600
gctatgtggc gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat 660
acactattct cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga 720
tggcatgaca gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc 780
caacttactt ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat 840
gggggatcat gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa 900
cgacgagcgt gacaccacga tgcctgtagc aatggcaaca acgttgcgca aactattaac 960
tggcgaacta cttactctag cttcccggca acaattaata gactggatgg aggcggataa 1020
agttgcagga ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc 1080
tggagccggt gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc 1140
ctcccgtatc gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag 1200
acagatcgct gagataggtg cctcactgat taagcattgg taactgtcag accaagttta 1260
ctcatatata ctttagattg atttaccccg gttgataatc agaaaagccc caaaaacagg 1320
aagattgtat aagcaaatat ttaaattgta aacgttaata ttttgttaaa attcgcgtta 1380
aatttttgtt aaatcagctc attttttaac caataggccg aaatcggcaa aatcccttat 1440
aaatcaaaag aatagcccga gatagggttg agtgttgttc cagtttggaa caagagtcca 1500
ctattaaaga acgtggactc caacgtcaaa gggcgaaaaa ccgtctatca gggcgatggc 1560
ccactacgtg aaccatcacc caaatcaagt tttttggggt cgaggtgccg taaagcacta 1620
aatcggaacc ctaaagggag cccccgattt agagcttgac ggggaaagcg aacgtggcga 1680
gaaaggaagg gaagaaagcg aaaggagcgg gcgctagggc gctggcaagt gtagcggtca 1740
cgctgcgcgt aaccaccaca cccgccgcgc ttaatgcgcc gctacagggc gcgtaaaagg 1800
atctaggtga agatcctttt tgataatctc atgaccaaaa tcccttaacg tgagttttcg 1860
ttccactgag cgtcagaccc cgtagaaaag atcaaaggat cttcttgaga tccttttttt 1920
ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc taccagcggt ggtttgtttg 1980
ccggatcaag agctaccaac tctttttccg aaggtaactg gcttcagcag agcgcagata 2040
ccaaatactg ttcttctagt gtagccgtag ttaggccacc acttcaagaa ctctgtagca 2100
ccgcctacat acctcgctct gctaatcctg ttaccagtgg ctgctgccag tggcgataag 2160
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg ataaggcgca gcggtcgggc 2220
tgaacggggg gttcgtgcac acagcccagc ttggagcgaa cgacctacac cgaactgaga 2280
tacctacagc gtgagctatg agaaagcgcc acgcttcccg aagggagaaa ggcggacagg 2340
tatccggtaa gcggcagggt cggaacagga gagcgcacga gggagcttcc agggggaaac 2400
gcctggtatc tttatagtcc tgtcgggttt cgccacctct gacttgagcg tcgatttttg 2460
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca gcaacgcggc ctttttacgg 2520
ttcctggcct tttgctggcc ttttgctcac atgtaatgtg agttagctca ctcattaggc 2580
accccaggct ttacacttta tgcttccggc tcgtatgttg tgtggaattg tgagcggata 2640
acaatttcac acaggaaaca gctatgacca tgattacgcc aagctacgta atacgactca 2700
ctagtggggc ccgtgcaatt gaagccggct ggcgccaagc ttctctgcag gattgaagcc 2760
tgctttttta tactaacttg agcgaaatct ggatccatgg ccaagttgac cagtgccgtt 2820
ccggtgctca ccgcgcgcga cgtcgccgga gcggtcgagt tctggaccga ccggctcggg 2880
ttctcccggg acttcgtgga ggacgacttc gccggtgtgg tccgggacga cgtgaccctg 2940
ttcatcagcg cggtccagga ccaggtggtg ccggacaaca ccctggcctg ggtgtgggtg 3000
cgcggcctgg acgagctgta cgccgagtgg tcggaggtcg tgtccacgaa cttccgggac 3060
gcctccgggc cggccatgac cgagatcggc gagcagccgt gggggcggga gttcgccctg 3120
cgcgacccgg ccggcaactg cgtgcacttc gtggccgagg agcaggactg acacgtgcta 3180
cgagatttcg attccaccgc cgccttctat gaaaggttgg gcttcggaat cgttttccgg 3240
gacgccggct ggatgatcct ccagcgcggg gatctcatgc tggagttctt cgcccacccc 3300
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 3360
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 3420
tatcatgtct gtataccg 3438

<210> 20
<211> 3451
<212> DNA

<213> Artificial Sequence
<220>

<223> HindIII Fragment containing the beta-glucuronidase
coding sequence, the rDNA intergenic spacer, and
the Mastl sequence

<400> 20

aagcttgacc tggaatatcg cgagtaaact gaaaatcacg gaaaatgaga aatacacact 60
ttaggacgtg aaatatggcg aggaaaactg aaaaaggtgg aaaatttaga aatgtccact 120
gtaggacgtg gaatatggca agaaaactga aaatcatgga aaatgagaaa catccacttg 180
acgacttgaa aaatgacgaa atcactaaaa aacgtgaaaa atgagaaatg cacactgaag 240
gactccgcgg gaattcgatt gtgctagcca atgtttaaca agatgtcaag cacaatgaat 300
gttggtggtt ggtggtcgtg gctggcggtg gtggaaaatt gcggtggttc gagcggtagt 360
gatcggcgat ggttggtgtt tgcagcggtg tttgatatcg gaatcactta tggtggttgt 420
cacaatggag gtgcgtcatg gttattggtg gttggtcatc tatatatttt tataataata 480
ttaagtattt tacctatttt ttacatattt tttattaaat ttatgcattg tttgtatttt 540
taaatagttt ttatcgtact tgttttataa aatattttat tattttatgt gttatattat 600
tacttgatgt attggaaatt ttctccattg ttttttctat atttataata attttcttat 660
ttttttttgt tttattatgt attttttcgt tttataataa atatttatta aaaaaaatat 720
tatttttgta aaatatatca tttacaatgt ttaaaagtca tttgtgaata tattagctaa 780
gttgtacttc tttttgtgca tttggtgttg tacatgtcta ttatgattct ctggccaaaa 840
catgtctact cctgtcactt gggttttttt ttttaagaca taatcactag tgattatatc 900
tagactgaag gcgggaaacg acaatctgat catgagcgga gaattaaggg agtcacgtta 960
tgacccccgc cgatgacgcg ggacaagccg ttttacgttt ggaactgaca gaaccgcaac 1020
gttgaaggag ccactcagcc gcgggtttct ggagtttaat gagctaagca catacgtcag 1080
aaaccattat tgcgcgttca aaagtcgcct aaggtcacta tcagctagca aatatttctt 1140
gtcaaaaatg ctccactgac gttccataaa ttcccctcgg tatccaatta gagtctcata 1200
ttcactctca atccaaataa tctgcaccgg atctcgagat cgaattcccg cggccgcgaa 1260
ttcactagtg gatccccggg tacggtcagt cccttatgtt acgtcctgta gaaaccccaa 1320
cccgtgaaat caaaaaactc gacggcctgt gggcattcag tctggatcgc gaaaactgtg 1380
gaattgagca gcgttggtgg gaaagcgcgt tacaagaaag ccgggcaatt gctgtgccag 1440
gcagttttaa cgatcagttc gccgatgcag atattcgtaa ttatgtgggc aacgtctggt 1500
atcagcgcga agtctttata ccgaaaggtt gggcaggcca gcgtatcgtg ctgcgtttcg 1560
atgcggtcac tcattacggc aaagtgtggg tcaataatca ggaagtgatg gagcatcagg 1620
gcggctatac gccatttgaa gccgatgtca cgccgtatgt tattgccggg aaaagtgtac 1680
gtatcacagt ttgtgtgaac aacgaactga actggcagac tatcccgccg ggaatggtga 1740
ttaccgacga aaacggcaag aaaaagcagt cttacttcca tgatttcttt aactacgccg 1800
ggatccatcg cagcgtaatg ctctacacca cgccgaacac ctgggtggac gatatcaccg 1860
tggtgacgca tgtcgcgcaa gactgtaacc acgcgtctgt tgactggcag gtggtggcca 1920
atggtgatgt cagcgttgaa ctgcgtgatg cggatcaaca ggtggttgca actggacaag 1980
gcaccagcgg gactttgcaa gtggtgaatc cgcacctctg gcaaccgggt gaaggttatc 2040
tctatgaact gtacgtcaca gccaaaagcc agacagagtg tgatatctac ccgctgcgcg 2100
tcggcatccg gtcagtggca gtgaagggcg aacagttcct gatcaaccac aaaccgttct 2160
actttactgg ctttggccgt catgaagatg cggatttgcg cggcaaagga ttcgataacg 2220

tgctgatggt gcacgatcac gcattaatgg actggattgg ggccaactcc taccgtacct 2280

cgcattaccc ttacgctgaa gagatgctcg actgggcaga tgaacatggc atcgtggtga 2340

ttgatgaaac tgcagctgtc ggctttaacc tctctttagg cattggtttc gaagcgggca 2400

acaagccgaa agaactgtac agcgaagagg cagtcaacgg ggaaactcag caggcgcact 2460

tacaggcgat taaagagctg atagcgcgtg acaaaaacca cccaagcgtg gtgatgtgga 2520

gtattgccaa cgaaccggat acccgtccgc aaggtgcacg ggaatatttc gcgccactgg 2580

cggaagcaac gcgtaaactc gatccgacgc gtccgatcac ctgcgtcaat gtaatgttct 2640

gcgacgctca caccgatacc atcagcgatc tctttgatgt gctgtgcctg aaccgttatt 2700

acggttggta tgtccaaagc ggcgatttgg aaacggcaga gaaggtactg gaaaaagaac 2760

ttctggcctg gcaggagaaa ctgcatcagc cgattatcat caccgaatac ggcgtggata 2820

cgttagccgg gctgcactca atgtacaccg acatgtggag tgaagagtat cagtgtgcat 2880

ggctggatat gtatcaccgc gtctttgatc gcgtcagcgc cgtcgtcggt gaacaggtat 2940

ggaatttcgc cgattttgcg acctcgcaag gcatattgcg cgttggcggt aacaagaagg 3000

ggatcttcac ccgcgaccgc aaaccgaagt cggcggcttt tctgctgcaa aaacgctgga 3060

ctggcatgaa cttcggtgaa aaaccgcagc agggaggcaa acaatgaatc aacaactctc 3120

ctggcgcacc atcgtcggct acagcctcgg gaattgcgta ccgagctcga atttccccga 3180

tcgttcaaac atttggcaat aaagtttctt aagattgaat cctgttgccg gtcttgcgat 3240

gattatcata taatttctgt tgaattacgt taagcatgta ataattaaca tgtaatgcat 3300

gacgttattt atgagatggg tttttatgat tagagtcccg caattataca tttaatacgc 3360

gatagaaaac aaaatatagc gcgcaaacta ggataaatta tcgcgcgcgg tgtcatctat 3420

gttactagat cgggaattcg atatcaagct t 3451

<210> 21 
<211> 14627 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pAglla Plasmid 
<400> 21 

catgccaacc acagggttcc cctcgggatc aaagtacttt gatccaaccc ctccgctgct 60 

atagtgcagt cggcttctga cgttcagtgc agccgtcttc tgaaaacgac atgtcgcaca 120

agtcctaagt tacgcgacag gctgccgccc tgcccttttc ctggcgtttt cttgtcgcgt 180 

gttttagtcg cataaagtag aatacttgcg actagaaccg gagacattac gccatgaaca 240

agagcgccgc cgctggcctg ctgggctatg cccgcgtcag caccgacgac caggacttga 300

ccaaccaacg ggccgaactg cacgcggccg gctgcaccaa gctgttttcc gagaagatca 360

ccggcaccag gcgcgaccgc ccggagctgg ccaggatgct tgaccaccta cgccctggcg 420 

acgttgtgac agtgaccagg ctagaccgcc tggcccgcag cacccgcgac ctactggaca 480

ttgccgagcg catccaggag gccggcgcgg gcctgcgtag cctggcagag ccgtgggccg 540

acaccaccac gccggccggc cgcatggtgt tgaccgtgtt cgccggcatt gccgagttcg 600

agcgttccct aatcatcgac cgcacccgga gcgggcgcga ggccgccaag gcccgaggcg 660

tgaagtttgg cccccgccct accctcaccc cggcacagat cgcgcacgcc cgcgagctga 720

tcgaccagga aggccgcacc gtgaaagagg cggctgcact gcttggcgtg catcgctcga 780

ccctgtaccg cgcacttgag cgcagcgagg aagtgacgcc caccgaggcc aggcggcgcg 840

gtgccttccg tgaggacgca ttgaccgagg ccgacgccct ggcggccgcc gagaatgaac 900 

gccaagagga acaagcatga aaccgcacca ggacggccag gacgaaccgt ttttcattac 960 

cgaagagatc gaggcggaga tgatcgcggc cgggtacgtg ttcgagccgc ccgcgcacgt 1020 

ctcaaccgtg cggctgcatg aaatcctggc cggtttgtct gatgccaagc tggcggcctg 1080

gccggccagc ttggccgctg aagaaaccga gcgccgccgt ctaaaaaggt gatgtgtatt 1140

tgagtaaaac agcttgcgtc atgcggtcgc tgcgtatatg atgcgatgag taaataaaca 1200

aatacgcaag gggaacgcat gaaggttatc gctgtactta accagaaagg cgggtcaggc 1260

aagacgacca tcgcaaccca tctagcccgc gccctgcaac tcgccggggc cgatgttctg 1320 

ttagtcgatt ccgatcccca gggcagtgcc cgcgattggg cggccgtgcg ggaagatcaa 1380 

ccgctaaccg ttgtcggcat cgaccgcccg acgattgacc gcgacgtgaa ggccatcggc 1440 

cggcgcgact tcgtagtgat cgacggagcg ccccaggcgg cggacttggc tgtgtccgcg 1500 

atcaaggcag ccgacttcgt gctgattccg gtgcagccaa gcccttacga catatgggcc 1560 

accgccgacc tggtggagct ggttaagcag cgcattgagg tcacggatgg aaggctacaa 1620 

gcggcctttg tcgtgtcgcg ggcgatcaaa ggcacgcgca tcggcggtga ggttgccgag 1680

gcgctggccg ggtacgagct gcccattctt gagtcccgta tcacgcagcg cgtgagctac 1740

ccaggcactg ccgccgccgg cacaaccgtt cttgaatcag aacccgaggg cgacgctgcc 1800

cgcgaggtcc aggcgctggc cgctgaaatt aaatcaaaac tcatttgagt taatgaggta 1860 

aagagaaaat gagcaaaagc acaaacacgc taagtgccgg ccgtccgagc gcacgcagca 1920 

gcaaggctgc aacgttggcc agcctggcag acacgccagc catgaagcgg gtcaactttc 1980 

agttgccggc ggaggatcac accaagctga agatgtacgc ggtacgccaa ggcaagacca 2040

ttaccgagct gctatctgaa tacatcgcgc agctaccaga gtaaatgagc aaatgaataa 2100 



WO 2002/096923 



PCT/US2002/017451 






atgagtagat gaattttagc ggctaaagga ggcggcatgg aaaatcaaga acaaccaggc 2160 
accgacgccg tggaatgccc catgtgtgga ggaacgggcg gttggccagg cgtaagcggc 2220 
tgggttgtct gccggccctg caatggcact ggaaccccca agcccgagga atcggcgtga 2280
cggtcgcaaa ccatccggcc cggtacaaat cggcgcggcg ctgggtgatg acctggtgga 2340
gaagttgaag gccgcgcagg ccgcccagcg gcaacgcatc gaggcagaag cacgccccgg 2400
tgaatcgtgg caagcggccg ctgatcgaat ccgcaaagaa tcccggcaac cgccggcagc 2460
cggtgcgccg tcgattagga agccgcccaa gggcgacgag caaccagatt ttttcgttcc 2520
gatgctctat gacgtgggca cccgcgatag tcgcagcatc atggacgtgg ccgttttccg 2580
tctgtcgaag cgtgaccgac gagctggcga ggtgatccgc tacgagcttc cagacgggca 2640
cgtagaggtt tccgcagggc cggccggcat ggccagtgtg tgggattacg acctggtact 2700
gatggcggtt tcccatctaa ccgaatccat gaaccgatac cgggaaggga agggagacaa 2760 
gcccggccgc gtgttccgtc cacacgttgc ggacgtactc aagttctgcc ggcgagccga 2820 
tggcggaaag cagaaagacg acctggtaga aacctgcatt cggttaaaca ccacgcacgt 2880 
tgccatgcag cgtacgaaga aggccaagaa cggccgcctg gtgacggtat ccgagggtga 2940
agccttgatt agccgctaca agatcgtaaa gagcgaaacc gggcggccgg agtacatcga 3000 
gatcgagcta gctgattgga tgtaccgcga gatcacagaa ggcaagaacc cggacgtgct 3060 
gacggttcac cccgattact ttttgatcga tcccggcatc ggccgttttc tctaccgcct 3120 
ggcacgccgc gccgcaggca aggcagaagc cagatggttg ttcaagacga tctacgaacg 3180 
cagtggcagc gccggagagt tcaagaagtt ctgtttcacc gtgcgcaagc tgatcgggtc 3240 
aaatgacctg ccggagtacg atttgaagga ggaggcgggg caggctggcc cgatcctagt 3300 
catgcgctac cgcaacctga tcgagggcga agcatccgcc ggttcctaat gtacggagca 3360 
gatgctaggg caaattgccc tagcagggga aaaaggtcga aaaggtctct ttcctgtgga 3420 
tagcacgtac attgggaacc caaagccgta cattgggaac cggaacccgt acattgggaa 3480 
cccaaagccg tacattggga accggtcaca catgtaagtg actgatataa aagagaaaaa 3540
aggcgatttt tccgcctaaa actctttaaa acttattaaa actcttaaaa cccgcctggc 3600 
ctgtgcataa ctgtctggcc agcgcacagc cgaagagctg caaaaagcgc ctacccttcg 3660 
gtcgctgcgc tccctacgcc ccgccgcttc gcgtcggcct atcgcggccg ctggccgctc 3720 
aaaaatggct ggcctacggc caggcaatct accagggcgc ggacaagccg cgccgtcgcc 3780 
actcgaccgc cggcgcccac atcaaggcac cctgcctcgc gcgtttcggt gatgacggtg 3840 
aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg 3900
ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggcgcagcca 3960
tgacccagtc acgtagcgat agcggagtgt atactggctt aactatgcgg catcagagca 4020
gattgtactg agagtgcacc atatgcggtg tgaaataccg cacagatgcg taaggagaaa 4080
ataccgcatc aggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 4140
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 4200
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 4260
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 4320 
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 4380 
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 4440 
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 4500 
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 4560 
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 4620
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 4680
gttcttgaag tggtggccta actacggcta cactagaagg acagtatttg gtatctgcgc 4740 
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 4800 
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 4860 
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 4920 
acgttaaggg attttggtca tgcattctag gtactaaaac aattcatcca gtaaaatata 4980
atattttatt ttctcccaat caggcttgat ccccagtaag tcaaaaaata gctcgacata 5040
ctgttcttcc ccgatatcct ccctgatcga ccggacgcag aaggcaatgt cataccactt 5100
gtccgccctg ccgcttctcc caagatcaat aaagccactt actttgccat ctttcacaaa 5160 
gatgttgctg tctcccaggt cgccgtggga aaagacaagt tcctcttcgg gcttttccgt 5220 
ctttaaaaaa tcatacagct cgcgcggatc tttaaatgga gtgtcttctt cccagttttc 5280 
gcaatccaca tcggccagat cgttattcag taagtaatcc aattcggcta agcggctgtc 5340 
taagctattc gtatagggac aatccgatat gtcgatggag tgaaagagcc tgatgcactc 5400 
cgcatacagc tcgataatct tttcagggct ttgttcatct tcatactctt ccgagcaaag 5460 
gacgccatcg gcctcactca tgagcagatt gctccagcca tcatgccgtt caaagtgcag 5520
gacctttgga acaggcagct ttccttccag ccatagcatc atgtcctttt cccgttccac 5580 
atcataggtg gtccctttat accggctgtc cgtcattttt aaatataggt tttcattttc 5640 
tcccaccagc ttatatacct tagcaggaga cattccttcc gtatctttta cgcagcggta 5700
tttttcgatc agttttttca attccggtga tattctcatt ttagccattt attatttcct 5760 
tcctcttttc tacagtattt aaagataccc caagaagcta attataacaa gacgaactcc 5820 
aattcactgt tccttgcatt ctaaaacctt aaataccaga aaacagcttt ttcaaagttg 5880
ttttcaaagt tggcgtataa catagtatcg acggagccga ttttgaaacc gcggtgatca 5940
caggcagcaa cgctctgtca tcgttacaat caacatgcta ccctccgcga gatcatccgt 6000
gtttcaaacc cggcagctta gttgccgttc ttccgaatag catcggtaac atgagcaaag 6060
tctgccgcct tacaacggct ctcccgctga cgccgtcccg gactgatggg ctgcctgtat 6120
cgagtggtga ttttgtgccg agctgccggt cggggagctg ttggctggct ggtggcagga 6180
tatattgtgg tgtaaacaaa ttgacgctta gacaacttaa taacacattg cggacgtttt 6240
taatgtactg aattaacgcc gaattaattc gggggatctg gattttagta ctggattttg 6300
gttttaggaa ttagaaattt tattgataga agtattttac aaatacaaat acatactaag 6360
ggtttcttat atgctcaaca catgagcgaa accctatagg aaccctaatt cccttatctg 6420
ggaactactc acacattatt atggagaaac tcgagtcaaa tctcggtgac gggcaggacc 6480
ggacggggcg gtaccggcag gctgaagtcc agctgccaga aacccacgtc atgccagttc 6540
ccgtgcttga agccggccgc ccgcagcatg ccgcgggggg catatccgag cgcctcgtgc 6600
atgcgcacgc tcgggtcgtt gggcagcccg atgacagcga ccacgctctt gaagccctgt 6660
gcctccaggg acttcagcag gtgggtgtag agcgtggagc ccagtcccgt ccgctggtgg 6720
cggggggaga cgtacacggt cgactcggcc gtccagtcgt aggcgttgcg tgccttccag 6780
gggcccgcgt aggcgatgcc ggcgacctcg ccgtccacct cggcgacgag ccagggatag 6840
cgctcccgca gacggacgag gtcgtccgtc cactcctgcg gttcctgcgg ctcggtacgg 6900
aagttgaccg tgcttgtctc gatgtagtgg ttgacgatgg tgcagaccgc cggcatgtcc 6960
gcctcggtgg cacggcggat gtcggccggg cgtcgttctg ggctcatggt agactcgaga 7020
gagatagatt tgtagagaga gactggtgat ttcagcgtgt cctctccaaa tgaaatgaac 7080
ttccttatat agaggaaggt cttgcgaagg atagtgggat tgtgcgtcat cccttacgtc 7140
agtggagata tcacatcaat ccacttgctt tgaagacgtg gttggaacgt cttctttttc 7200
cacgatgctc ctcgtgggtg ggggtccatc tttgggacca ctgtcggcag aggcatcttg 7260
aacgatagcc tttcctttat cgcaatgatg gcatttgtag gtgccacctt ccttttctac 7320
tgtccttttg atgaagtgac agatagctgg gcaatggaat ccgaggaggt ttcccgatat 7380
taccctttgt tgaaaagtct caatagccct ttggtcttct gagactgtat ctttgatatt 7440
cttggagtag acgagagtgt cgtgctccac catgttatca catcaatcca cttgctttga 7500
agacgtggtt ggaacgtctt ctttttccac gatgctcctc gtgggtgggg gtccatcttt 7560
gggaccactg tcggcagagg catcttgaac gatagccttt cctttatcgc aatgatggca 7620
tttgtaggtg ccaccttcct tttctactgt ccttttgatg aagtgacaga tagctgggca 7680
atggaatccg aggaggtttc ccgatattac cctttgttga aaagtctcaa tagccctttg 7740
gtcttctgag actgtatctt tgatattctt ggagtagacg agagtgtcgt gctccaccat 7800
gttggcaagc tgctctagcc aatacgcaaa ccgcctctcc ccgcgcgttg gccgattcat 7860
taatgcagct ggcacgacag gtttcccgac tggaaagcgg gcagtgagcg caacgcaatt 7920
aatgtgagtt agctcactca ttaggcaccc caggctttac actttatgct tccggctcgt 7980
atgttgtgtg gaattgtgag cggataacaa tttcacacag gaaacagcta tgaccatgat 8040
tacgaattcg agccttgact agagggtcga cggtatacag acatgataag atacattgat 8100
gagtttggac aaaccacaac tagaatgcag tgaaaaaaat gctttatttg tgaaatttgt 8160
gatgctattg ctttatttgt aaccattata agctgcaata aacaagttgg ggtgggcgaa 8220
gaactccagc atgagatccc cgcgctggag gatcatccag ccggcgtccc ggaaaacgat 8280
tccgaagccc aacctttcat agaaggcggc ggtggaatcg aaatctcgta gcacgtgtca 8340
gtcctgctcc tcggccacga agtgcacgca gttgccggcc gggtcgcgca gggcgaactc 8400
ccgcccccac ggctgctcgc cgatctcggt catggccggc ccggaggcgt cccggaagtt 8460
cgtggacacg acctccgacc actcggcgta cagctcgtcc aggccgcgca cccacaccca 8520
ggccagggtg ttgtccggca ccacctggtc ctggaccgcg ctgatgaaca gggtcacgtc 8580
gtcccggacc acaccggcga agtcgtcctc cacgaagtcc cgggagaacc cgagccggtc 8640
ggtccagaac tcgaccgctc cggcgacgtc gcgcgcggtg agcaccggaa cggcactggt 8700
caacttggcc atggatccag atttcgctca agttagtata aaaaagcagg cttcaatcct 8760
gcaggaattc gatcgacact ctcgtctact ccaagaatat caaagataca gtctcagaag 8820
accaaagggc tattgagact tttcaacaaa gggtaatatc gggaaacctc ctcggattcc 8880
attgcccagc tatctgtcac ttcatcaaaa ggacagtaga aaaggaaggt ggcacctaca 8940
aatgccatca ttgcgataaa ggaaaggcta tcgttcaaga tgcctctgcc gacagtggtc 9000
ccaaagatgg acccccaccc acgaggagca tcgtggaaaa agaagacgtt ccaaccacgt 9060
cttcaaagca agtggattga tgtgataaca tggtggagca cgacactctc gtctactcca 9120
agaatatcaa agatacagtc tcagaagacc aaagggctat tgagactttt caacaaaggg 9180
taatatcggg aaacctcctc ggattccatt gcccagctat ctgtcacttc atcaaaagga 9240
cagtagaaaa ggaaggtggc acctacaaat gccatcattg cgataaagga aaggctatcg 9300
ttcaagatgc ctctgccgac agtggtccca aagatggacc cccacccacg aggagcatcg 9360
tggaaaaaga agacgttcca accacgtctt caaagcaagt ggattgatgt gatatctcca 9420
ctgacgtaag ggatgacgca caatcccact atccttcgca agaccttcct ctatataagg 9480
aagttcattt catttggaga ggacacgctg aaatcaccag tctctctcta caaatctatc 9540
tctctcgagc tttcgcagat ccgggggggc aatgagatat gaaaaagcct gaactcaccg 9600
cgacgtctgt cgagaagttt ctgatcgaaa agttcgacag cgtctccgac ctgatgcagc 9660
tctcggaggg cgaagaatct cgtgctttca gcttcgatgt aggagggcgt ggatatgtcc 9720
tgcgggtaaa tagctgcgcc gatggtttct acaaagatcg ttatgtttat cggcactttg 9780
catcggccgc gctcccgatt ccggaagtgc ttgacattgg ggagtttagc gagagcctga 9840
cctattgcat ctcccgccgt gcacagggtg tcacgttgca agacctgcct gaaaccgaac 9900
tgcccgctgt tctacaaccg gtcgcggagg ctatggatgc gatcgctgcg gccgatctta 9960
gccagacgag cgggttcggc ccattcggac cgcaaggaat cggtcaatac actacatggc 10020
gtgatttcat atgcgcgatt gctgatcccc atgtgtatca ctggcaaact gtgatggacg 10080
acaccgtcag tgcgtccgtc gcgcaggctc tcgatgagct gatgctttgg gccgaggact 10140
gccccgaagt ccggcacctc gtgcacgcgg atttcggctc caacaatgtc ctgacggaca 10200
atggccgcat aacagcggtc attgactgga gcgaggcgat gttcggggat tcccaatacg 10260
aggtcgccaa catcttcttc tggaggccgt ggttggcttg tatggagcag cagacgcgct 10320
acttcgagcg gaggcatccg gagcttgcag gatcgccacg actccgggcg tatatgctcc 10380
gcattggtct tgaccaactc tatcagagct tggttgacgg caatttcgat gatgcagctt 10440
gggcgcaggg tcgatgcgac gcaatcgtcc gatccggagc cgggactgtc gggcgtacac 10500
aaatcgcccg cagaagcgcg gccgtctgga ccgatggctg tgtagaagta ctcgccgata 10560
gtggaaaccg acgccccagc actcgtccga gggcaaagaa atagagtaga tgccgaccgg 10620
atctgtcgat cgacaagctc gagtttctcc ataataatgt gtgagtagtt cccagataag 10680
ggaattaggg ttcctatagg gtttcgctca tgtgttgagc atataagaaa cccttagtat 10740
gtatttgtat ttgtaaaata cttctatcaa taaaatttct aattcctaaa accaaaatcc 10800
agtactaaaa tccagatccc ccgaattaat tcggcgttaa ttcagatcaa gcttgacctg 10860
gaatatcgcg agtaaactga aaatcacgga aaatgagaaa tacacacttt aggacgtgaa 10920
atatggcgag gaaaactgaa aaaggtggaa aatttagaaa tgtccactgt aggacgtgga 10980
atatggcaag aaaactgaaa atcatggaaa atgagaaaca tccacttgac gacttgaaaa 11040
atgacgaaat cactaaaaaa cgtgaaaaat gagaaatgca cactgaagga ctccgcggga 11100
attcgattgt gctagccaat gtttaacaag atgtcaagca caatgaatgt tggtggttgg 11160
tggtcgtggc tggcggtggt ggaaaattgc ggtggttcga gcggtagtga tcggcgatgg 11220
ttggtgtttg cagcggtgtt tgatatcgga atcacttatg gtggttgtca caatggaggt 11280
gcgtcatggt tattggtggt tggtcatcta tatattttta taataatatt aagtatttta 11340
cctatttttt acatattttt tattaaattt atgcattgtt tgtattttta aatagttttt 11400
atcgtacttg ttttataaaa tattttatta ttttatgtgt tatattatta cttgatgtat 11460
tggaaatttt ctccattgtt ttttctatat ttataataat tttcttattt ttttttgttt 11520
tattatgtat tttttcgttt tataataaat atttattaaa aaaaatatta tttttgtaaa 11580
atatatcatt tacaatgttt aaaagtcatt tgtgaatata ttagctaagt tgtacttctt 11640
tttgtgcatt tggtgttgta catgtctatt atgattctct ggccaaaaca tgtctactcc 11700
tgtcacttgg gttttttttt ttaagacata atcactagtg attatatcta gactgaaggc 11760
gggaaacgac aatctgatca tgagcggaga attaagggag tcacgttatg acccccgccg 11820
atgacgcggg acaagccgtt ttacgtttgg aactgacaga accgcaacgt tgaaggagcc 11880
actcagccgc gggtttctgg agtttaatga gctaagcaca tacgtcagaa accattattg 11940
cgcgttcaaa agtcgcctaa ggtcactatc agctagcaaa tatttcttgt caaaaatgct 12000
ccactgacgt tccataaatt cccctcggta tccaattaga gtctcatatt cactctcaat 12060
ccaaataatc tgcaccggat ctcgagatcg aattcccgcg gccgcgaatt cactagtgga 12120
tccccgggta cggtcagtcc cttatgttac gtcctgtaga aaccccaacc cgtgaaatca 12180
aaaaactcga cggcctgtgg gcattcagtc tggatcgcga aaactgtgga attgagcagc 12240
gttggtggga aagcgcgtta caagaaagcc gggcaattgc tgtgccaggc agttttaacg 12300
atcagttcgc cgatgcagat attcgtaatt atgtgggcaa cgtctggtat cagcgcgaag 12360
tctttatacc gaaaggttgg gcaggccagc gtatcgtgct gcgtttcgat gcggtcactc 12420
attacggcaa agtgtgggtc aataatcagg aagtgatgga gcatcagggc ggctatacgc 12480
catttgaagc cgatgtcacg ccgtatgtta ttgccgggaa aagtgtacgt atcacagttt 12540
gtgtgaacaa cgaactgaac tggcagacta tcccgccggg aatggtgatt accgacgaaa 12600
acggcaagaa aaagcagtct tacttccatg atttctttaa ctacgccggg atccatcgca 12660
gcgtaatgct ctacaccacg ccgaacacct gggtggacga tatcaccgtg gtgacgcatg 12720
tcgcgcaaga ctgtaaccac gcgtctgttg actggcaggt ggtggccaat ggtgatgtca 12780
gcgttgaact gcgtgatgcg gatcaacagg tggttgcaac tggacaaggc accagcggga 12840
ctttgcaagt ggtgaatccg cacctctggc aaccgggtga aggttatctc tatgaactgt 12900
acgtcacagc caaaagccag acagagtgtg atatctaccc gctgcgcgtc ggcatccggt 12960
cagtggcagt gaagggcgaa cagttcctga tcaaccacaa accgttctac tttactggct 13020
ttggccgtca tgaagatgcg gatttgcgcg gcaaaggatt cgataacgtg ctgatggtgc 13080
acgatcacgc attaatggac tggattgggg ccaactccta ccgtacctcg cattaccctt 13140
acgctgaaga gatgctcgac tgggcagatg aacatggcat cgtggtgatt gatgaaactg 13200
cagctgtcgg ctttaacctc tctttaggca ttggtttcga agcgggcaac aagccgaaag 13260
aactgtacag cgaagaggca gtcaacgggg aaactcagca ggcgcactta caggcgatta 13320
aagagctgat agcgcgtgac aaaaaccacc caagcgtggt gatgtggagt attgccaacg 13380
aaccggatac ccgtccgcaa ggtgcacggg aatatttcgc gccactggcg gaagcaacgc 13440
gtaaactcga tccgacgcgt ccgatcacct gcgtcaatgt aatgttctgc gacgctcaca 13500
ccgataccat cagcgatctc tttgatgtgc tgtgcctgaa ccgttattac ggttggtatg 13560
tccaaagcgg cgatttggaa acggcagaga aggtactgga aaaagaactt ctggcctggc 13620
aggagaaact gcatcagccg attatcatca ccgaatacgg cgtggatacg ttagccgggc 13680
tgcactcaat gtacaccgac atgtggagtg aagagtatca gtgtgcatgg ctggatatgt 13740
atcaccgcgt ctttgatcgc gtcagcgccg tcgtcggtga acaggtatgg aatttcgccg 13800
attttgcgac ctcgcaaggc atattgcgcg ttggcggtaa caagaagggg atcttcaccc 13860
gcgaccgcaa accgaagtcg gcggcttttc tgctgcaaaa acgctggact ggcatgaact 13920
tcggtgaaaa accgcagcag ggaggcaaac aatgaatcaa caactctcct ggcgcaccat 13980
cgtcggctac agcctcggga attgcgtacc gagctcgaat ttccccgatc gttcaaacat 14040
ttggcaataa agtttcttaa gattgaatcc tgttgccggt cttgcgatga ttatcatata 14100
atttctgttg aattacgtta agcatgtaat aattaacatg taatgcatga cgttatttat 14160




gagatgggtt tttatgatta gagtcccgca attatacatt taatacgcga tagaaaacaa 14220 

aatatagcgc gcaaactagg ataaattatc gcgcgcggtg tcatctatgt tactagatcg 14280 

ggaattcgat atcaagcttg gcactggccg tcgttttaca acgtcgtgac tgggaaaacc 14340 

ctggcgttac ccaacttaat cgccttgcag cacatccccc tttcgccagc tggcgtaata 14400

gcgaagaggc ccgcaccgat cgcccttccc aacagttgcg cagcctgaat ggcgaatgct 14460 

agagcagctt gagcttggat cagattgtcg tttcccgcct tcagtttaaa ctatcagtgt 14520 

ttgacaggat atattggcgg gtaaacctaa gagaaaagag cgtttattag aataacggat 14580 

atttaaaagg gcgtgaaaag gtttatccgt tcgtccattt gtatgtg 14627

<210> 22 
<211> 4257 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pPUR Plasmid 
<400> 22 

ctgtggaatg tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt 60 
atgcaaagca tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca 120
gcaggcagaa gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta 180
actccgccca tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga 240
ctaatttttt ttatttatgc agaggccgag gccgcctcgg cctctgagct attccagaag 300
tagtgaggag gcttttttgg aggcctaggc ttttgcaaaa agcttgcatg cctgcaggtc 360
ggccgccacg accggtgccg ccaccatccc ctgacccacg cccctgaccc ctcacaagga 420
gacgaccttc catgaccgag tacaagccca cggtgcgcct cgccacccgc gacgacgtcc 480
cccgggccgt acgcaccctc gccgccgcgt tcgccgacta ccccgccacg cgccacaccg 540
tcgacccgga ccgccacatc gagcgggtca ccgagctgca agaactcttc ctcacgcgcg 600 
tcgggctcga catcggcaag gtgtgggtcg cggacgacgg cgccgcggtg gcggtctgga 660 
ccacgccgga gagcgtcgaa gcgggggcgg tgttcgccga gatcggcccg cgcatggccg 720
agttgagcgg ttcccggctg gccgcgcagc aacagatgga aggcctcctg gcgccgcacc 780
ggcccaagga gcccgcgtgg ttcctggcca ccgtcggcgt ctcgcccgac caccagggca 840
agggtctggg cagcgccgtc gtgctccccg gagtggaggc ggccgagcgc gccggggtgc 900
ccgccttcct ggagacctcc gcgccccgca acctcccctt ctacgagcgg ctcggcttca 960
ccgtcaccgc cgacgtcgag gtgcccgaag gaccgcgcac ctggtgcatg acccgcaagc 1020
ccggtgcctg acgcccgccc cacgacccgc agcgcccgac cgaaaggagc gcacgacccc 1080
atggctccga ccgaagccga cccgggcggc cccgccgacc ccgcacccgc ccccgaggcc 1140 
caccgactct agaggatcat aatcagccat accacatttg tagaggtttt acttgcttta 1200 
aaaaacctcc cacacctccc cctgaacctg aaacataaaa tgaatgcaat tgttgttgtt 1260 
aacttgttta ttgcagctta taatggttac aaataaagca atagcatcac aaatttcaca 1320 
aataaagcat ttttttcact gcattctagt tgtggtttgt ccaaactcat caatgtatct 1380 
tatcatgtct ggatccccag gaagctcctc tgtgtcctca taaaccctaa cctcctctac 1440 
ttgagaggac attccaatca taggctgccc atccaccctc tgtgtcctcc tgttaattag 1500 
gtcacttaac aaaaaggaaa ttgggtaggg gtttttcaca gaccgctttc taagggtaat 1560 
tttaaaatat ctgggaagtc ccttccactg ctgtgttcca gaagtgttgg taaacagccc 1620 
acaaatgtca acagcagaaa catacaagct gtcagctttg cacaagggcc caacaccctg 1680 
ctcatcaaga agcactgtgg ttgctgtgtt agtaatgtgc aaaacaggag gcacattttc 1740
cccacctgtg taggttccaa aatatctagt gttttcattt ttacttggat caggaaccca 1800
gcactccact ggataagcat tatccttatc caaaacagcc ttgtggtcag tgttcatctg 1860 
ctgactgtca actgtagcat tttttggggt tacagtttga gcaggatatt tggtcctgta 1920 
gtttgctaac acaccctgca gctccaaagg ttccccacca acagcaaaaa aatgaaaatt 1980 
tgacccttga atgggttttc cagcaccatt ttcatgagtt ttttgtgtcc ctgaatgcaa 2040 
gtttaacata gcagttaccc caataacctc agttttaaca gtaacagctt cccacatcaa 2100 
aatatttcca caggttaagt cctcatttaa attaggcaaa ggaattcttg aagacgaaag 2160 
ggcctcgtga tacgcctatt tttataggtt aatgtcatga taataatggt ttcttagacg 2220
tcaggtggca cttttcgggg aaatgtgcgc ggaaccccta tttgtttatt tttctaaata 2280 
cattcaaata tgtatccgct catgagacaa taaccctgat aaatgcttca ataatattga 2340 
aaaaggaaga gtatgagtat tcaacatttc cgtgtcgccc ttattccctt ttttgcggca 2400 
ttttgccttc ctgtttttgc tcacccagaa acgctggtga aagtaaaaga tgctgaagat 2460 
cagttgggtg cacgagtggg ttacatcgaa ctggatctca acagcggtaa gatccttgag 2520
agttttcgcc ccgaagaacg ttttccaatg atgagcactt ttaaagttct gctatgtggc 2580 
gcggtattat cccgtgttga cgccgggcaa gagcaactcg gtcgccgcat acactattct 2640 
cagaatgact tggttgagta ctcaccagtc acagaaaagc atcttacgga tggcatgaca 2700 
gtaagagaat tatgcagtgc tgccataacc atgagtgata acactgcggc caacttactt 2760 
ctgacaacga tcggaggacc gaaggagcta accgcttttt tgcacaacat gggggatcat 2820 
gtaactcgcc ttgatcgttg ggaaccggag ctgaatgaag ccataccaaa cgacgagcgt 2880 
gacaccacga tgcctgcagc aatggcaaca acgttgcgca aactattaac tggcgaacta 2940
cttactctag cttcccggca acaattaata gactggatgg aggcggataa agttgcagga 3000
ccacttctgc gctcggccct tccggctggc tggtttattg ctgataaatc tggagccggt 3060
gagcgtgggt ctcgcggtat cattgcagca ctggggccag atggtaagcc ctcccgtatc 3120
gtagttatct acacgacggg gagtcaggca actatggatg aacgaaatag acagatcgct 3180
gagataggtg cctcactgat taagcattgg taactgtcag accaagttta ctcatatata 3240
ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt 3300
gataatctca tgaccaaaat cccttaacgt gagttttcgt tccactgagc gtcagacccc 3360
gtagaaaaga tcaaaggatc ttcttgagat cctttttttc tgcgcgtaat ctgctgcttg 3420
caaacaaaaa aaccaccgct accagcggtg gtttgtttgc cggatcaaga gctaccaact 3480
ctttttccga aggtaactgg cttcagcaga gcgcagatac caaatactgt ccttctagtg 3540
tagccgtagt taggccacca cttcaagaac tctgtagcac cgcctacata cctcgctctg 3600
ctaatcctgt taccagtggc tgctgccagt ggcgataagt cgtgtcttac cgggttggac 3660
tcaagacgat agttaccgga taaggcgcag cggtcgggct gaacgggggg ttcgtgcaca 3720
cagcccagct tggagcgaac gacctacacc gaactgagat acctacagcg tgagctatga 3780
gaaagcgcca cgcttcccga agggagaaag gcggacaggt atccggtaag cggcagggtc 3840
ggaacaggag agcgcacgag ggagcttcca gggggaaacg cctggtatct ttatagtcct 3900
gtcgggtttc gccacctctg acttgagcgt cgatttttgt gatgctcgtc aggggggcgg 3960
agcctatgga aaaacgccag caacgcggcc tttttacggt tcctggcctt ttgctggcct 4020
tttgctcaca tgttctttcc tgcgttatcc cctgattctg tggataaccg tattaccgcc 4080
tttgagtgag ctgataccgc tcgccgcagc cgaacgaccg agcgcagcga gtcagtgagc 4140
gaggaagcgg aagagcgcct gatgcggtat tttctcctta cgcatctgtg cggtatttca 4200
caccgcatat ggtgcactct cagtacaatc tgctctgatg ccgcatagtt aagccag 4257



<210> 23 

<211> 2713 

<212> DNA 

<213> Artificial Sequence 



<220> 

<223> pNEB193 Plasmid 



<400> 23 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360
tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt cgagctcggt acccgggggc 420
gcgccggatc cttaattaag tctagagtcg actgtttaaa cctgcaggca tgcaagcttg 480
gcgtaatcat ggtcatagct gtttcctgtg tgaaattgtt atccgctcac aattccacac 540
aacatacgag ccggaagcat aaagtgtaaa gcctggggtg cctaatgagt gagctaactc 600
acattaattg cgttgcgctc actgcccgct ttccagtcgg gaaacctgtc gtgccagctg 660
cattaatgaa tcggccaacg cgcggggaga ggcggtttgc gtattgggcg ctcttccgct 720
tcctcgctca ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac 780
tcaaaggcgg taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga 840
gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat 900
aggctccgcc cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac 960
ccgacaggac tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct 1020
gttccgaccc tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg 1080
ctttctcata gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg 1140
ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt 1200
cttgagtcca acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg 1260
attagcagag cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac 1320
ggctacacta gaaggacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga 1380
aaaagagttg gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt 1440
gtttgcaagc agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt 1500
tctacggggt ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga 1560
ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc 1620
taaagtatat atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct 1680
atctcagcga tctgtctatt tcgttcatcc atagttgcct gactccccgt cgtgtagata 1740
actacgatac gggagggctt accatctggc cccagtgctg caatgatacc gcgagaccca 1800
cgctcaccgg ctccagattt atcagcaata aaccagccag ccggaagggc cgagcgcaga 1860
agtggtcctg caactttatc cgcctccatc cagtctatta attgctgccg ggaagctaga 1920
gtaagtagtt cgccagttaa tagtttgcgc aacgttgttg ccattgctac aggcatcgtg 1980
gtgtcacgct cgtcgtttgg tatggcttca ttcagctccg gttcccaacg atcaaggcga 2040
gttacatgat cccccatgtt gtgcaaaaaa gcggttagct ccttcggtcc tccgatcgtt 2100
gtcagaagta agttggccgc agtgttatca ctcatggtta tggcagcact gcataattct 2160
cttactgtca tgccatccgt aagatgcttt tctgtgactg gtgagtactc aaccaagtca 2220
ttctgagaat agtgtatgcg gcgaccgagt tgctcttgcc cggcgtcaat acgggataat 2280
accgcgccac atagcagaac tttaaaagtg ctcatcattg gaaaacgttc ttcggggcga 2340
aaactctcaa ggatcttacc gctgttgaga tccagttcga tgtaacccac tcgtgcaccc 2400
aactgatctt cagcatcttt tactttcacc agcgtttctg ggtgagcaaa aacaggaagg 2460
caaaatgccg caaaaaaggg aataagggcg acacggaaat gttgaatact catactcttc 2520
ctttttcaat attattgaag catttatcag ggttattgtc tcatgagcgg atacatattt 2580
gaatgtattt agaaaaataa acaaataggg gttccgcgca catttccccg aaaagtgcca 2640
cctgacgtct aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 2700
aggccctttc gtc 2713



<210> 24 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attPUP Primer 
<400> 24 

ccttgcgcta atgctctgtt acagg 25

<210> 25
<211> 26
<212> DNA
<213> Artificial Sequence
<220>
<223> attPDWN Primer
<400> 25
cagaggcagg gagtgggaca aaattg 26

<210> 26
<211> 4346
<212> DNA
<213> Artificial Sequence
<220>
<223> pSV40193attPsensePUR Plasmid
<400> 26
ccggtgccgc caccatcccc tgacccacgc ccctgacccc tcacaaggag acgaccttcc 60
atgaccgagt acaagcccac ggtgcgcctc gccacccgcg acgacgtccc ccgggccgta 120
cgcaccctcg ccgccgcgtt cgccgactac cccgccacgc gccacaccgt cgacccggac 180
cgccacatcg agcgggtcac cgagctgcaa gaactcttcc tcacgcgcgt cgggctcgac 240
atcggcaagg tgtgggtcgc ggacgacggc gccgcggtgg cggtctggac cacgccggag 300
agcgtcgaag cgggggcggt gttcgccgag atcggcccgc gcatggccga gttgagcggt 360
tcccggctgg ccgcgcagca acagatggaa ggcctcctgg cgccgcaccg gcccaaggag 420
cccgcgtggt tcctggccac cgtcggcgtc tcgcccgacc accagggcaa gggtctgggc 480
agcgccgtcg tgctccccgg agtggaggcg gccgagcgcg ccggggtgcc cgccttcctg 540
gagacctccg cgccccgcaa cctccccttc tacgagcggc tcggcttcac cgtcaccgcc 600
gacgtcgagg tgcccgaagg accgcgcacc tggtgcatga cccgcaagcc cggtgcctga 660
cgcccgcccc acgacccgca gcgcccgacc gaaaggagcg cacgacccca tggctccgac 720
cgaagccgac ccgggcggcc ccgccgaccc cgcacccgcc cccgaggccc accgactcta 780
gaggatcata atcagccata ccacatttgt agaggtttta cttgctttaa aaaacctccc 840
acacctcccc ctgaacctga aacataaaat gaatgcaatt gttgttgtta acttgtttat 900
tgcagcttat aatggttaca aataaagcaa tagcatcaca aatttcacaa ataaagcatt 960
tttttcactg cattctagtt gtggtttgtc caaactcatc aatgtatctt atcatgtctg 1020
gatccgcgcc ggatccttaa ttaagtctag agtcgactgt ttaaacctgc aggcatgcaa 1080
gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg ctcacaattc 1140
cacacaacat acgagccgga agcataaagt gtaaagcctg gggtgcctaa tgagtgagct 1200
aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac ctgtcgtgcc 1260
agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt gggcgctctt 1320
ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 1380
ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 1440
tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 1500
tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 1560
gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 1620
ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 1680
tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 1740
agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 1800
atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 1860
acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 1920
actacggcta cactagaagg acagtatttg gtatctgcgc tctgctgaag ccagttacct 1980
tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 2040
tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 2100
tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 2160
tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 2220
caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 2280
cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt 2340
agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 2400
acccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 2460
gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 2520
ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 2580
tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 2640
ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga 2700
tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 2760
attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 2820
agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 2880
ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 2940
ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa cccactcgtg 3000
cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga gcaaaaacag 3060
gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga atactcatac 3120
tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg agcggataca 3180
tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt ccccgaaaag 3240
tgccacctga cgtctaagaa accattatta tcatgacatt aacctataaa aataggcgta 3300
tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg tgaaaacctc tgacacatgc 3360
agctcccgga gacggtcaca gcttgtctgt aagcggatgc cgggagcaga caagcccgtc 3420
agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc 3480
agattgtact gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa 3540
aataccgcat caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg 3600
tgcgggcctc ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa 3660
gttgggtaac gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattcg 3720
agctgtggaa tgtgtgtcag ttagggtgtg gaaagtcccc aggctcccca gcaggcagaa 3780
gtatgcaaag catgcatctc aattagtcag caaccaggtg tggaaagtcc ccaggctccc 3840
cagcaggcag aagtatgcaa agcatgcatc tcaattagtc agcaaccata gtcccgcccc 3900
taactccgcc catcccgccc ctaactccgc ccagttccgc ccattctccg ccccatggct 3960
gactaatttt ttttatttat gcagaggccg aggccgcctc ggcctctgag ctattccaga 4020
agtagtgagg aggctttttt ggaggctcgg tacccccttg cgctaatgct ctgttacagg 4080
tcactaatac catctaagta gttgattcat agtgactgca tatgttgtgt tttacagtat 4140
tatgtagtct gttttttatg caaaatctaa tttaatatat tgatatttat atcattttac 4200
gtttctcgtt cagctttttt atactaagtt ggcattataa aaaagcattg cttatcaatt 4260
tgttgcaacg aacaggtcac tatcagtcaa aataaaatca ttatttgatt tcaattttgt 4320
cccactccct gcctctgggg ggcgcg 4346



<210> 27 
<211> 5855 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> pCXLamIntR Plasmid



<400> 27 

gtcgacattg attattgact agttattaat agtaatcaat tacggggtca ttagttcata 60
gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct ggctgaccgc 120
ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta acgccaatag 180
ggactttcca ttgacgtcaa tgggtggact atttacggta aactgcccac ttggcagtac 240
atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt aaatggcccg 300
cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag tacatctacg 360
tattagtcat cgctattacc atgggtcgag gtgagcccca cgttctgctt cactctcccc 420
atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480
gcgatggggg cggggggggg gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg 540
gggcggggcg aggcggagag gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt 600
tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660
gggagtcgct gcgttgcctt cgccccgtgc cccgctccgc gccgcctcgc gccgcccgcc 720
ccggctctga ctgaccgcgt tactcccaca ggtgagcggg cgggacggcc cttctcctcc 780
gggctgtaat tagcgcttgg tttaatgacg gctcgtttct tttctgtggc tgcgtgaaag 840
ccttaaaggg ctccgggagg gccctttgtg cgggggggag cggctcgggg ggtgcgtgcg 900
tgtgtgtgtg cgtggggagc gccgcgtgcg gcccgcgctg cccggcggct gtgagcgctg 960
cgggcgcggc gcggggcttt gtgcgctccg cgtgtgcgcg aggggagcgc ggccgggggc 1020
ggtgccccgc ggtgcggggg ggctgcgagg ggaacaaagg ctgcgtgcgg ggtgtgtgcg 1080
tgggggggtg agcagggggt gtgggcgcgg cggtcgggct gtaacccccc cctgcacccc 1140
cctccccgag ttgctgagca cggcccggct tcgggtgcgg ggctccgtgc ggggcgtggc 1200
gcggggctcg ccgtgccggg cggggggtgg cggcaggtgg gggtgccggg cggggcgggg 1260
ccgcctcggg ccggggaggg ctcgggggag gggcgcggcg gccccggagc gccggcggct 1320
gtcgaggcgc ggcgagccgc agccattgcc ttttatggta atcgtgcgag agggcgcagg 1380
gacttccttt gtcccaaatc tggcggagcc gaaatctggg aggcgccgcc gcaccccctc 1440
tagcgggcgc gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg ggagggcctt 1500
cgtgcgtcgc cgcgccgccg tccccttctc catctccagc ctcggggctg ccgcaggggg 1560
acggctgcct tcggggggga cggggcaggg cggggttcgg cttctggcgt gtgaccggcg 1620
gctctagagc ctctgctaac catgttcatg ccttcttctt tttcctacag ctcctgggca 1680
acgtgctggt tgttgtgctg tctcatcatt ttggcaaaga attcatggga agaaggcgaa 1740
gtcatgagcg ccgggattta ccccctaacc tttatataag aaacaatgga tattactgct 1800
acagggaccc aaggacgggt aaagagtttg gattaggcag agacaggcga atcgcaatca 1860
ctgaagctat acaggccaac attgagttat tttcaggaca caaacacaag cctctgacag 1920
cgagaatcaa cagtgataat tccgttacgt tacattcatg gcttgatcgc tacgaaaaaa 1980
tcctggccag cagaggaatc aagcagaaga cactcataaa ttacatgagc aaaattaaag 2040
caataaggag gggtctgcct gatgctccac ttgaagacat caccacaaaa gaaattgcgg 2100
caatgctcaa tggatacata gacgagggca aggcggcgtc agccaagtta atcagatcaa 2160
cactgagcga tgcattccga gaggcaatag ctgaaggcca tataacaaca aaccatgtcg 2220
ctgccactcg cgcagcaaaa tctagagtaa ggagatcaag acttacggct gacgaatacc 2280
tgaaaattta tcaagcagca gaatcatcac catgttggct cagacttgca atggaactgg 2340
ctgttgttac cgggcaacga gttggtgatt tatgcgaaat gaagtggtct gatatcgtag 2400
atggatatct ttatgtcgag caaagcaaaa caggcgtaaa aattgccatc ccaacagcat 2460
tgcatattga tgctctcgga atatcaatga aggaaacact tgataaatgc aaagagattc 2520
ttggcggaga aaccataatt gcatctactc gtcgcgaacc gctttcatcc ggcacagtat 2580
caaggtattt tatgcgcgca cgaaaagcat caggtctttc cttcgaaggg gatccgccta 2640
cctttcacga gttgcgcagt ttgtctgcaa gactctatga gaagcagata agcgataagt 2700
ttgctcaaca tcttctcggg cataagtcgg acaccatggc atcacagtat cgtgatgaca 2760
gaggcaggga gtgggacaaa attgaaatca aataagaatt cactcctcag gtgcaggctg 2820
cctatcagaa ggtggtggct ggtgtggcca atgccctggc tcacaaatac cactgagatc 2880
tttttccctc tgccaaaaat tatggggaca tcatgaagcc ccttgagcat ctgacttctg 2940
gctaataaag gaaatttatt ttcattgcaa tagtgtgttg gaattttttg tgtctctcac 3000
tcggaaggac atatgggagg gcaaatcatt taaaacatca gaatgagtat ttggtttaga 3060
gtttggcaac atatgccata tgctggctgc catgaacaaa ggtggctata aagaggtcat 3120
cagtatatga aacagccccc tgctgtccat tccttattcc atagaaaagc cttgacttga 3180
ggttagattt tttttatatt ttgttttgtg ttattttttt ctttaacatc cctaaaattt 3240
tccttacatg ttttactagc cagatttttc ctcctctcct gactactccc agtcatagct 3300
gtccctcttc tcttatgaag atccctcgac ctgcagccca agcttggcgt aatcatggtc 3360
atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 3420
aagcataaag tgtaaagcct ggggtgccta atgagtgagc taactcacat taattgcgtt 3480
gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagcggatcc gcatctcaat 3540
tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 3600
tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 3660
gcctcggcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 3720
tgcaaaaagc taacttgttt attgcagctt ataatggtta caaataaagc aatagcatca 3780
caaatttcac aaataaagca tttttttcac tgcattctag ttgtggtttg tccaaactca 3840
tcaatgtatc ttatcatgtc tggatccgct gcattaatga atcggccaac gcgcggggag 3900
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 3960
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 4020
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 4080
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 4140
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 4200
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 4260
gtccgccttt ctcccttcgg gaagcgtggc gctttctcaa tgctcacgct gtaggtatct 4320
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 4380
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 4440
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 4500
tacagagttc ttgaagtggt ggcctaacta cggctacact agaaggacag tatttggtat 4560
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 4620
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 4680
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 4740
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 4800
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 4860
cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat ttcgttcatc 4920
catagttgcc tgactccccg tcgtgtagat aactacgata cgggagggct taccatctgg 4980
ccccagtgct gcaatgatac cgcgagaccc acgctcaccg gctccagatt tatcagcaat 5040
aaaccagcca gccggaaggg ccgagcgcag aagtggtcct gcaactttat ccgcctccat 5100
ccagtctatt aattgttgcc gggaagctag agtaagtagt tcgccagtta atagtttgcg 5160
caacgttgtt gccattgcta caggcatcgt ggtgtcacgc tcgtcgtttg gtatggcttc 5220
attcagctcc ggttcccaac gatcaaggcg agttacatga tcccccatgt tgtgcaaaaa 5280
agcggttagc tccttcggtc ctccgatcgt tgtcagaagt aagttggccg cagtgttatc 5340
actcatggtt atggcagcac tgcataattc tcttactgtc atgccatccg taagatgctt 5400
ttctgtgact ggtgagtact caaccaagtc attctgagaa tagtgtatgc ggcgaccgag 5460
ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca catagcagaa ctttaaaagt 5520
gctcatcatt ggaaaacgtt cttcggggcg aaaactctca aggatcttac cgctgttgag 5580
atccagttcg atgtaaccca ctcgtgcacc caactgatct tcagcatctt ttactttcac 5640
cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc gcaaaaaagg gaataagggc 5700
gacacggaaa tgttgaatac tcatactctt cctttttcaa tattattgaa gcatttatca 5760
gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 5820
ggttccgcgc acatttcccc gaaaagtgcc acctg 5855

<210> 28
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> 5PacSV40 Primer

<400> 28

ctgttaatta actgtggaat gtgtgtcagt tagggtg 37

<210> 29
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Antisense Zeo Primer

<400> 29

tgaacagggt cacgtcgtcc 20

<210> 30
<211> 1032
<212> DNA
<213> Escherichia Coli

<220>
<221> CDS
<222> (1)...(1032)
<223> nucleotide sequence encoding Cre recombinase

<400> 30

atg tcc aat tta ctg acc gta cac caa aat ttg cct gca tta ccg gtc 48
Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val
1 5 10 15

gat gca acg agt gat gag gtt cgc aag aac ctg atg gac atg ttc agg 96
Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg
20 25 30

gat cgc cag gcg ttt tct gag cat acc tgg aaa atg ctt ctg tcc gtt 144
Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val
35 40 45

tgc cgg tcg tgg gcg gca tgg tgc aag ttg aat aac cgg aaa tgg ttt 192
Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe
50 55 60

ccc gca gaa cct gaa gat gtt cgc gat tat ctt cta tat ctt cag gcg 240
Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala
65 70 75 80

cgc ggt ctg gca gta aaa act atc cag caa cat ttg ggc cag cta aac 288
Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn
85 90 95

atg ctt cat cgt cgg tcc ggg ctg cca cga cca agt gac agc aat gct 336
Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala
100 105 110

gtt tca ctg gtt atg cgg cgg atc cga aaa gaa aac gtt gat gcc ggt 384
Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly
115 120 125

gaa cgt gca aaa cag gct cta gcg ttc gaa cgc act gat ttc gac cag 432
Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln
130 135 140

gtt cgt tca ctc atg gaa aat agc gat cgc tgc cag gat ata cgt aat 480
Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn
145 150 155 160

ctg gca ttt ctg ggg att gct tat aac acc ctg tta cgt ata gcc gaa 528
Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu
165 170 175

att gcc agg atc agg gtt aaa gat atc tca cgt act gac ggt ggg aga 576
Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg
180 185 190

atg tta atc cat att ggc aga acg aaa acg ctg gtt agc acc gca ggt 624
Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly
195 200 205

gta gag aag gca ctt agc ctg ggg gta act aaa ctg gtc gag cga tgg 672
Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp
210 215 220

att tcc gtc tct ggt gta gct gat gat ccg aat aac tac ctg ttt tgc 720
Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys
225 230 235 240

cgg gtc aga aaa aat ggt gtt gcc gcg cca tct gcc acc agc cag cta 768
Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu
245 250 255

tca act cgc gcc ctg gaa ggg att ttt gaa gca act cat cga ttg att 816
Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile
260 265 270

tac ggc gct aag gat gac tct ggt cag aga tac ctg gcc tgg tct gga 864
Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly
275 280 285

cac agt gcc cgt gtc gga gcc gcg cga gat atg gcc cgc gct gga gtt 912
His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val
290 295 300

tca ata ccg gag atc atg caa gct ggt ggc tgg acc aat gta aat att 960
Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile
305 310 315 320

gtc atg aac tat atc cgt aac ctg gat agt gaa aca ggg gca atg gtg 1008
Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val
325 330 335

cgc ctg ctg gaa gat ggc gat tag 1032
Arg Leu Leu Glu Asp Gly Asp *
340

<210> 31 
<211> 343 
<212> PRT 

<213> Escherichia Coli 
<400> 31 

Met Ser Asn Leu Leu Thr Val His Gln Asn Leu Pro Ala Leu Pro Val
1 5 10 15

Asp Ala Thr Ser Asp Glu Val Arg Lys Asn Leu Met Asp Met Phe Arg
20 25 30

Asp Arg Gln Ala Phe Ser Glu His Thr Trp Lys Met Leu Leu Ser Val
35 40 45

Cys Arg Ser Trp Ala Ala Trp Cys Lys Leu Asn Asn Arg Lys Trp Phe
50 55 60

Pro Ala Glu Pro Glu Asp Val Arg Asp Tyr Leu Leu Tyr Leu Gln Ala
65 70 75 80

Arg Gly Leu Ala Val Lys Thr Ile Gln Gln His Leu Gly Gln Leu Asn
85 90 95

Met Leu His Arg Arg Ser Gly Leu Pro Arg Pro Ser Asp Ser Asn Ala
100 105 110

Val Ser Leu Val Met Arg Arg Ile Arg Lys Glu Asn Val Asp Ala Gly
115 120 125

Glu Arg Ala Lys Gln Ala Leu Ala Phe Glu Arg Thr Asp Phe Asp Gln
130 135 140

Val Arg Ser Leu Met Glu Asn Ser Asp Arg Cys Gln Asp Ile Arg Asn
145 150 155 160

Leu Ala Phe Leu Gly Ile Ala Tyr Asn Thr Leu Leu Arg Ile Ala Glu
165 170 175

Ile Ala Arg Ile Arg Val Lys Asp Ile Ser Arg Thr Asp Gly Gly Arg
180 185 190

Met Leu Ile His Ile Gly Arg Thr Lys Thr Leu Val Ser Thr Ala Gly
195 200 205

Val Glu Lys Ala Leu Ser Leu Gly Val Thr Lys Leu Val Glu Arg Trp
210 215 220

Ile Ser Val Ser Gly Val Ala Asp Asp Pro Asn Asn Tyr Leu Phe Cys
225 230 235 240

Arg Val Arg Lys Asn Gly Val Ala Ala Pro Ser Ala Thr Ser Gln Leu
245 250 255

Ser Thr Arg Ala Leu Glu Gly Ile Phe Glu Ala Thr His Arg Leu Ile
260 265 270

Tyr Gly Ala Lys Asp Asp Ser Gly Gln Arg Tyr Leu Ala Trp Ser Gly
275 280 285

His Ser Ala Arg Val Gly Ala Ala Arg Asp Met Ala Arg Ala Gly Val
290 295 300

Ser Ile Pro Glu Ile Met Gln Ala Gly Gly Trp Thr Asn Val Asn Ile
305 310 315 320

Val Met Asn Tyr Ile Arg Asn Leu Asp Ser Glu Thr Gly Ala Met Val
325 330 335

Arg Leu Leu Glu Asp Gly Asp
340
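
The correspondence between a listed coding sequence and its listed translation (for example SEQ ID NO: 30 and SEQ ID NO: 31) can be checked mechanically. The following is a minimal sketch, not part of the listing itself: it assumes the standard genetic code, and the names `CODON_TABLE` and `translate` are hypothetical helpers introduced only for illustration.

```python
from itertools import product

# Assumption: standard genetic code, codons enumerated with bases ordered t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence into one-letter amino acids ('*' marks a stop)."""
    seq = "".join(cds.lower().split())  # drop the spaces used in the listing layout
    if len(seq) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Codons 17-24 of the Cre CDS (Asp Ala Thr Ser Asp Glu Val Arg in the listing).
print(translate("gat gca acg agt gat gag gtt cgc"))  # DATSDEVR
# Final codons of the Cre CDS, ending in the stop codon.
print(translate("gaa gat ggc gat tag"))              # EDGD*
```

Comparing the output against the three-letter residues printed under each codon row is a quick way to spot transcription errors in either sequence.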

<210> 32
<211> 33
<212> DNA
<213> Artificial Sequence

<220>
<223> attB1 recognition sequence

<400> 32

tgaagcctgc ttttttatac taacttgagc gaa 33


<210> 33 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-att recognition sequence 

<221> misc_difference
<222> 18 

<223> n is a or g or c or t/u 
<400> 33 

rkycwgcttt yktrtacnaa stsgb 25 

<210> 34
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> m-attB recognition sequence
<221> misc_difference
<222> 18
<223> n is a or c or g or t/u

<400> 34

agccwgcttt yktrtacnaa ctsgb 25

<210> 35
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> m-attR recognition sequence
<221> misc_difference
<222> 18
<223> n is a or g or c or t/u

<400> 35

gttcagcttt cktrtacnaa ctsgb 25

<210> 36
<211> 25
<212> DNA
<213> Artificial Sequence

<220>
<223> m-attL recognition sequence
<221> misc_difference
<222> 18
<223> n is a or g or c or t/u

<400> 36

agccwgcttt cktrtacnaa gtsgb 25


<210> 37 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> m-attP1 recognition sequence

<221> misc_difference 
<222> 18 

<223> n is a or g or c or t/u 
<400> 37 

gttcagcttt yktrtacnaa gtsgb 25 
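
The degenerate m-att sequences use IUPAC ambiguity codes (r, k, y, w, s, b, n). As a reading aid only, the sketch below shows one way to test a concrete sequence against such a consensus. It assumes the standard IUPAC nucleotide code table; `IUPAC` and `mismatches` are hypothetical names, and the short strings in the usage lines are toy inputs rather than sequences from this listing.

```python
# Assumption: standard IUPAC nucleotide ambiguity codes (u is treated as t).
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t", "u": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg", "n": "acgt",
}


def mismatches(consensus, candidate):
    """Return the 1-based positions where candidate is not permitted by consensus."""
    cons = "".join(consensus.lower().split())
    cand = "".join(candidate.lower().split()).replace("u", "t")
    if len(cons) != len(cand):
        raise ValueError("sequences differ in length")
    return [
        pos
        for pos, (code, base) in enumerate(zip(cons, cand), start=1)
        if base not in IUPAC[code]
    ]


# Toy inputs: every base is allowed by its code, so no positions are reported.
print(mismatches("rkyn", "agct"))  # []
# Toy inputs: 's' allows only c or g, so position 1 is reported.
print(mismatches("sw", "at"))      # [1]
```

An empty result means the candidate is one of the sequences the consensus describes; any reported position points at a base to re-check against the source.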

<210> 38 

<211> 25 

<212> DNA

<213> Artificial Sequence 
<220> 

<223> attB2 recognition sequence 

<400> 38 

agcctgcttt cttgtacaaa cttgt 25 

<210> 39 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attB3 recognition sequence
<400> 39 

acccagcttt cttgtacaaa cttgt 25 

<210> 40 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR1 recognition sequence
<400> 40 

gttcagcttt tttgtacaaa cttgt 25 

<210> 41 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR2 recognition sequence 
<400> 41 

gttcagcttt cttgtacaaa cttgt 25 

<210> 42 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attR3 recognition sequence 



<400> 42

gttcagcttt cttgtacaaa gttgg 25

WO 2002/096923 PCT/US2002/017451

<210> 43 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL1 recognition sequence
<400> 43 

agcctgcttt tttgtacaaa gttgg 25 

<210> 44 
<211> 25 
<212> DNA 

<213> Artificial Sequence
<220> 

<223> attL2 recognition sequence 
<400> 44 

agcctgcttt cttgtacaaa gttgg 25 

<210> 45 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attL3 recognition sequence 
<400> 45 

acccagcttt cttgtacaaa gttgg 25 

<210> 46 

<211> 25 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP1 recognition sequence

<400> 46 

gttcagcttt tttgtacaaa gttgg 25 

<210> 47 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP2,P3 recognition sequence 
<400> 47 

gttcagcttt cttgtacaaa gttgg 25 

<210> 48 
<211> 282 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> attP recognition sequence 



<400> 48 

ccttgcgcta atgctctgtt acaggtcact aataccatct aagtagttga ttcatagtga 60 
ctgcatatgt tgtgttttac agtattatgt agtctgtttt ttatgcaaaa tctaatttaa 120

tatattgata tttatatcat tttacgtttc tcgttcagct tttttatact aagttggcat 180 

tataaaaaag cattgcttat caatttgttg caacgaacag gtcactatca gtcaaaataa 240 

aatcattatt tgatttcaat tttgtcccac tccctgcctc tg 282 

<210> 49 
<211> 1071 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> nucleotide sequence encoding Integrase E174R 

<221> CDS 

<222> (1)...(1071)

<223> Integrase E174R 

<400> 49 

atg gga aga agg cga agt cat gag cgc cgg gat tta ccc cct aac ctt 48
Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu
1 5 10 15

tat ata aga aac aat gga tat tac tgc tac agg gac cca agg acg ggt 96
Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly
20 25 30

aaa gag ttt gga tta ggc aga gac agg cga atc gca atc act gaa gct 144
Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala
35 40 45

ata cag gcc aac att gag tta ttt tca gga cac aaa cac aag cct ctg 192
Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu
50 55 60

aca gcg aga atc aac agt gat aat tcc gtt acg tta cat tca tgg ctt 240
Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu
65 70 75 80

gat cgc tac gaa aaa atc ctg gcc agc aga gga atc aag cag aag aca 288
Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr
85 90 95

ctc ata aat tac atg agc aaa att aaa gca ata agg agg ggt ctg cct 336
Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro
100 105 110

gat gct cca ctt gaa gac atc acc aca aaa gaa att gcg gca atg ctc 384
Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu
115 120 125

aat gga tac ata gac gag ggc aag gcg gcg tca gcc aag tta atc aga 432
Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg
130 135 140

tca aca ctg agc gat gca ttc cga gag gca ata gct gaa ggc cat ata 480
Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile
145 150 155 160

aca aca aac cat gtc gct gcc act cgc gca gca aaa tct aga gta agg 528
Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg
165 170 175

aga tca aga ctt acg gct gac gaa tac ctg aaa att tat caa gca gca 576
Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala
180 185 190

gaa tca tca cca tgt tgg ctc aga ctt gca atg gaa ctg gct gtt gtt 624
Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val
195 200 205

acc ggg caa cga gtt ggt gat tta tgc gaa atg aag tgg tct gat atc 672
Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile
210 215 220

gta gat gga tat ctt tat gtc gag caa agc aaa aca ggc gta aaa att 720
Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile
225 230 235 240

gcc atc cca aca gca ttg cat att gat gct ctc gga ata tca atg aag 768
Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys
245 250 255

gaa aca ctt gat aaa tgc aaa gag att ctt ggc gga gaa acc ata att 816
Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile
260 265 270

gca tct act cgt cgc gaa ccg ctt tca tcc ggc aca gta tca agg tat 864
Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr
275 280 285

ttt atg cgc gca cga aaa gca tca ggt ctt tcc ttc gaa ggg gat ccg 912
Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro
290 295 300

cct acc ttt cac gag ttg cgc agt ttg tct gca aga ctc tat gag aag 960
Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys
305 310 315 320

cag ata agc gat aag ttt gct caa cat ctt ctc ggg cat aag tcg gac 1008
Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp
325 330 335

acc atg gca tca cag tat cgt gat gac aga ggc agg gag tgg gac aaa 1056
Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys
340 345 350

att gaa atc aaa taa 1071
Ile Glu Ile Lys *
355

<210> 50
<211> 356
<212> PRT
<213> Artificial Sequence

<220>
<223> Integrase E174R

<400> 50

Met Gly Arg Arg Arg Ser His Glu Arg Arg Asp Leu Pro Pro Asn Leu
1 5 10 15

Tyr Ile Arg Asn Asn Gly Tyr Tyr Cys Tyr Arg Asp Pro Arg Thr Gly
20 25 30

Lys Glu Phe Gly Leu Gly Arg Asp Arg Arg Ile Ala Ile Thr Glu Ala
35 40 45

Ile Gln Ala Asn Ile Glu Leu Phe Ser Gly His Lys His Lys Pro Leu
50 55 60

Thr Ala Arg Ile Asn Ser Asp Asn Ser Val Thr Leu His Ser Trp Leu
65 70 75 80

Asp Arg Tyr Glu Lys Ile Leu Ala Ser Arg Gly Ile Lys Gln Lys Thr
85 90 95

Leu Ile Asn Tyr Met Ser Lys Ile Lys Ala Ile Arg Arg Gly Leu Pro
100 105 110

Asp Ala Pro Leu Glu Asp Ile Thr Thr Lys Glu Ile Ala Ala Met Leu
115 120 125

Asn Gly Tyr Ile Asp Glu Gly Lys Ala Ala Ser Ala Lys Leu Ile Arg
130 135 140

Ser Thr Leu Ser Asp Ala Phe Arg Glu Ala Ile Ala Glu Gly His Ile
145 150 155 160

Thr Thr Asn His Val Ala Ala Thr Arg Ala Ala Lys Ser Arg Val Arg
165 170 175

Arg Ser Arg Leu Thr Ala Asp Glu Tyr Leu Lys Ile Tyr Gln Ala Ala
180 185 190

Glu Ser Ser Pro Cys Trp Leu Arg Leu Ala Met Glu Leu Ala Val Val
195 200 205

Thr Gly Gln Arg Val Gly Asp Leu Cys Glu Met Lys Trp Ser Asp Ile
210 215 220

Val Asp Gly Tyr Leu Tyr Val Glu Gln Ser Lys Thr Gly Val Lys Ile
225 230 235 240

Ala Ile Pro Thr Ala Leu His Ile Asp Ala Leu Gly Ile Ser Met Lys
245 250 255

Glu Thr Leu Asp Lys Cys Lys Glu Ile Leu Gly Gly Glu Thr Ile Ile
260 265 270

Ala Ser Thr Arg Arg Glu Pro Leu Ser Ser Gly Thr Val Ser Arg Tyr
275 280 285

Phe Met Arg Ala Arg Lys Ala Ser Gly Leu Ser Phe Glu Gly Asp Pro
290 295 300

Pro Thr Phe His Glu Leu Arg Ser Leu Ser Ala Arg Leu Tyr Glu Lys
305 310 315 320

Gln Ile Ser Asp Lys Phe Ala Gln His Leu Leu Gly His Lys Ser Asp
325 330 335

Thr Met Ala Ser Gln Tyr Arg Asp Asp Arg Gly Arg Glu Trp Asp Lys
340 345 350

Ile Glu Ile Lys
355

<212> DNA
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